VABDC-Net: A framework for Visual-Caption Sentiment Recognition via spatio-depth visual attention and bi-directional caption processing.

Bibliographic Details
Title:	VABDC-Net: A framework for Visual-Caption Sentiment Recognition via spatio-depth visual attention and bi-directional caption processing.
Authors:	Pandey, Ananya¹ (AUTHOR) ananyaphdit08@gmail.com, Vishwakarma, Dinesh Kumar¹ (AUTHOR) dvishwakarma@gmail.com
Source:	Knowledge-Based Systems. Jun 2023, Vol. 269, pN.PAG-N.PAG. 1p.
Subjects:	Convolutional neural networks, Deep learning
Abstract:	People increasingly post images with captions on social media platforms to express their opinions, so Visual-Caption Sentiment Recognition (VCSR) has attracted growing attention, and the correlation between the visual and caption modalities is crucial to it. However, most recent VCSR strategies simply concatenate features from the visual and caption modalities using pre-trained deep learning models with millions of trainable parameters and no dedicated attention module, which leads to less desirable results. Motivated by this observation, we propose VABDC-Net, a novel model that integrates an attention module with a convolutional neural network to focus on the most relevant information in the visual modality, and uses an attentional tokenizer-based method to extract the most relevant contextual information from the caption modality. Our main contributions are: (1) an attentional tokenizer-based bi-directional caption branch that retrieves useful textual features from the captions, (2) an attentional visual branch that retrieves appropriate visual features, and (3) a cross-domain feature fusion that merges the multi-modal features and predicts sentiment. Thorough experimentation on two benchmark datasets, Twitter-15 (83.80% accuracy) and Twitter-17 (72.42% accuracy), indicates that our technique outperforms existing methods for VCSR. [ABSTRACT FROM AUTHOR]
	Copyright of Knowledge-Based Systems is the property of Elsevier B.V. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database:	Engineering Source

Accession Number:	163185601
Publication Type:	Academic Journal
PLink	https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=egs&AN=163185601
DOI:	10.1016/j.knosys.2023.110515
ISSN:	0950-7051 (print)
Language:	English
Publication Date:	7 June 2023