VABDC-Net: A framework for Visual-Caption Sentiment Recognition via spatio-depth visual attention and bi-directional caption processing.

Bibliographic Details
Title: VABDC-Net: A framework for Visual-Caption Sentiment Recognition via spatio-depth visual attention and bi-directional caption processing.
Authors: Pandey, Ananya (AUTHOR) ananyaphdit08@gmail.com; Vishwakarma, Dinesh Kumar (AUTHOR) dvishwakarma@gmail.com
Source: Knowledge-Based Systems. Jun2023, Vol. 269, pN.PAG-N.PAG. 1p.
Subjects: Convolutional neural networks, Deep learning
Abstract: People increasingly post images with captions on social media platforms to express their opinions, and Visual-Caption Sentiment Recognition (VCSR) has consequently drawn growing attention. The correlation between the visual and caption modalities is crucial for VCSR. However, most recent VCSR strategies simply concatenate features from the two modalities, extracted with pre-trained deep learning models containing millions of trainable parameters, without adding a dedicated attention module, which ultimately leads to less desirable results. Motivated by this observation, we propose a novel model, VABDC-Net, that integrates an attention module with a convolutional neural network to focus on the most relevant information in the visual modality, and uses an attentional tokenizer-based method to extract the most relevant contextual information from the caption modality. The significant contributions of this work are: (1) an attentional tokenizer-based bi-directional caption branch to retrieve useful textual features from the captions, (2) an attentional visual branch to retrieve appropriate visual features, and (3) a cross-domain feature fusion to merge multi-modal features and predict sentiment. Thorough experimentation on two benchmark datasets, Twitter-15 (accuracy of 83.80%) and Twitter-17 (accuracy of 72.42%), indicates that our technique outperforms existing methods for VCSR. [ABSTRACT FROM AUTHOR]
Copyright of Knowledge-Based Systems is the property of Elsevier B.V. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
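As a reading aid, the three components named in the abstract (an attentional visual branch, a bi-directional caption branch, and feature fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer types or sizes, so the gating functions, the plain tanh recurrence, the random weights and embeddings, the dimensions, and the three-class output are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def visual_attention(fmap):
    """fmap: (C, H, W) feature map -> (C,) attended descriptor (assumed design)."""
    # Depth (channel) attention: one gate per channel from its global average.
    channel_gate = sigmoid(fmap.mean(axis=(1, 2)))            # (C,)
    fmap = fmap * channel_gate[:, None, None]
    # Spatial attention: one gate per location from the channel-wise mean.
    spatial_gate = sigmoid(fmap.mean(axis=0))                 # (H, W)
    fmap = fmap * spatial_gate[None, :, :]
    return fmap.mean(axis=(1, 2))                             # (C,)

def recurrent_pass(tokens, w_in, w_rec):
    """Plain tanh recurrence; returns the hidden state at every step, (T, D)."""
    h = np.zeros(w_rec.shape[0])
    states = []
    for x in tokens:
        h = np.tanh(w_in @ x + w_rec @ h)
        states.append(h)
    return np.stack(states)

def caption_branch(tokens, p):
    """tokens: (T, E) embeddings -> (2*D,) attention-pooled descriptor."""
    fwd = recurrent_pass(tokens, p["w_in_f"], p["w_rec_f"])
    # Run right-to-left, then flip back so step t lines up with fwd[t].
    bwd = recurrent_pass(tokens[::-1], p["w_in_b"], p["w_rec_b"])[::-1]
    states = np.concatenate([fwd, bwd], axis=1)               # (T, 2D)
    weights = softmax(states @ p["query"])                    # (T,)
    return weights @ states                                   # (2D,)

def fuse_and_classify(visual_vec, caption_vec, w_out, b_out):
    """Concatenate both modalities and map to class probabilities."""
    joint = np.concatenate([visual_vec, caption_vec])
    return softmax(w_out @ joint + b_out)

# Hypothetical sizes: channels, height, width, tokens, embedding, hidden, classes.
C, H, W, T, E, D, K = 8, 4, 4, 5, 6, 3, 3
params = {
    "w_in_f": rng.standard_normal((D, E)), "w_rec_f": rng.standard_normal((D, D)),
    "w_in_b": rng.standard_normal((D, E)), "w_rec_b": rng.standard_normal((D, D)),
    "query": rng.standard_normal(2 * D),
}
w_out, b_out = rng.standard_normal((K, C + 2 * D)), np.zeros(K)

fmap = rng.standard_normal((C, H, W))    # stand-in for CNN features
tokens = rng.standard_normal((T, E))     # stand-in for tokenized caption embeddings
visual_vec = visual_attention(fmap)
caption_vec = caption_branch(tokens, params)
probs = fuse_and_classify(visual_vec, caption_vec, w_out, b_out)
print(probs)
```

With random weights the printed probabilities carry no meaning; the sketch only shows how attended visual and caption vectors could be concatenated and passed to a classifier.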
Database: Engineering Source
FullText Text:
  Availability: 0
Header DbId: egs
DbLabel: Engineering Source
An: 163185601
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=163185601
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1016/j.knosys.2023.110515
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 1
        StartPage: N.PAG
    Subjects:
      – SubjectFull: Convolutional neural networks
        Type: general
      – SubjectFull: Deep learning
        Type: general
    Titles:
      – TitleFull: VABDC-Net: A framework for Visual-Caption Sentiment Recognition via spatio-depth visual attention and bi-directional caption processing.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Pandey, Ananya
      – PersonEntity:
          Name:
            NameFull: Vishwakarma, Dinesh Kumar
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 07
              M: 06
              Text: Jun2023
              Type: published
              Y: 2023
          Identifiers:
            – Type: issn-print
              Value: 09507051
          Numbering:
            – Type: volume
              Value: 269
          Titles:
            – TitleFull: Knowledge-Based Systems
              Type: main