Published in: Neural Processing Letters 3/2023

26.12.2022

Image–Text Sentiment Analysis Via Context Guided Adaptive Fine-Tuning Transformer

Authors: Xingwang Xiao, Yuanyuan Pu, Zhengpeng Zhao, Rencan Nie, Dan Xu, Wenhua Qian, Hao Wu



Abstract

Compared with single-modal content, multimodal content conveys users' sentiments and feelings more vividly, and multimodal sentiment analysis has therefore become a research hotspot. Because deep learning-based methods are data-hungry, transfer learning is widely used. However, most transfer learning-based approaches transfer a model pre-trained on the source domain to the target domain either by treating it as a fixed feature extractor (i.e., its parameters are frozen) or by applying a global fine-tuning strategy (i.e., all parameters are trainable). Both choices sacrifice the advantages of one domain or the other. In this paper, we propose a novel Context Guided Adaptive Fine-tuning Transformer (CGAFT) that adaptively exploits the strengths of both the source and target domains for image–text sentiment analysis. In CGAFT, a Context Guided Policy Network first produces optimal weights for each image–text instance. These weights indicate how much image sentiment information should be absorbed from each layer of the image model pre-trained on the source domain and from the parallel model fine-tuned on the target domain. The image–text instance and its weights are then fed into a Sentiment Analysis Network, which extracts contextual image sentiment representations drawn from both domains to improve image–text sentiment analysis. In addition, observing that no publicly available image–text sentiment dataset exists in Chinese, we build Flickr-ICT, an image–Chinese text dataset containing 13,874 image–Chinese text pairs. Experiments on three image–text datasets demonstrate that CGAFT outperforms strong baselines.
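The core mechanism the abstract describes — a policy network emitting per-instance, per-layer weights that blend a frozen source-domain model with a trainable target-domain copy — can be sketched as follows. This is a minimal illustrative sketch only, not the paper's implementation: the class names (`ContextGuidedPolicy`, `AdaptiveFineTune`), layer shapes, and the sigmoid gating are assumptions for demonstration.

```python
import torch
import torch.nn as nn


class ContextGuidedPolicy(nn.Module):
    """Hypothetical policy network: maps an instance's context vector
    to one mixing weight in [0, 1] per layer."""

    def __init__(self, feat_dim: int, num_layers: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_layers),
            nn.Sigmoid(),  # weights in [0, 1]
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        return self.net(context)  # shape: (batch, num_layers)


class AdaptiveFineTune(nn.Module):
    """Per-layer convex blend of a frozen source model and a
    trainable target copy, gated by instance-specific weights."""

    def __init__(self, source_layers, target_layers, feat_dim: int):
        super().__init__()
        self.source = nn.ModuleList(source_layers)
        for p in self.source.parameters():
            p.requires_grad = False  # source domain: frozen
        self.target = nn.ModuleList(target_layers)  # target domain: fine-tuned
        self.policy = ContextGuidedPolicy(feat_dim, len(source_layers))

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        w = self.policy(context)  # one weight per layer, per instance
        for i, (src, tgt) in enumerate(zip(self.source, self.target)):
            wi = w[:, i : i + 1]  # broadcast over feature dim
            x = wi * src(x) + (1.0 - wi) * tgt(x)
        return x
```

With weights near 1 an instance relies on the pre-trained source layers; near 0 it relies on the fine-tuned target layers — so the trade-off between the two transfer strategies is resolved per instance rather than globally.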


77.
Zurück zum Zitat Wu K, Peng H, Chen M, Fu J, Chao H (2021) Rethinking and improving relative position encoding for vision transformer. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV), pp 10033–10041 Wu K, Peng H, Chen M, Fu J, Chao H (2021) Rethinking and improving relative position encoding for vision transformer. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV), pp 10033–10041
78.
Zurück zum Zitat Yang J, Sun M, Sun X (2017) Learning visual sentiment distributions via augmented conditional probability neural network. In: Proceedings of the thirty-first AAAI conference on artificial intelligence, pp 224–230 Yang J, Sun M, Sun X (2017) Learning visual sentiment distributions via augmented conditional probability neural network. In: Proceedings of the thirty-first AAAI conference on artificial intelligence, pp 224–230
79.
Zurück zum Zitat Borth D, Ji R, Chen T, Breuel T, Chang S-F (2013) Large-scale visual sentiment ontology and detectors using adjective noun pairs. In: Proceedings of the 21st ACM international conference on multimedia, pp 223–232 Borth D, Ji R, Chen T, Breuel T, Chang S-F (2013) Large-scale visual sentiment ontology and detectors using adjective noun pairs. In: Proceedings of the 21st ACM international conference on multimedia, pp 223–232
80.
Zurück zum Zitat Machajdik J, Hanbury A (2010) Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM international conference on multimedia, pp 83–92 Machajdik J, Hanbury A (2010) Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM international conference on multimedia, pp 83–92
Metadata
Title
Image–Text Sentiment Analysis Via Context Guided Adaptive Fine-Tuning Transformer
Authors
Xingwang Xiao
Yuanyuan Pu
Zhengpeng Zhao
Rencan Nie
Dan Xu
Wenhua Qian
Hao Wu
Publication date
26.12.2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 3/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-11124-w
