
Attention-Based Modality-Gated Networks for Image-Text Sentiment Analysis

Published: 05 July 2020

Abstract

Sentiment analysis of social multimedia data has attracted extensive research interest and has been applied to many tasks, such as election prediction and product evaluation. Sentiment analysis of a single modality (e.g., text or image) has been broadly studied. However, little attention has been paid to the sentiment analysis of multimodal data. Different modalities usually carry complementary information. Thus, it is necessary to learn the overall sentiment by combining the visual content with the text description. In this article, we propose a novel method, Attention-Based Modality-Gated Networks (AMGN), to exploit the correlation between the image and text modalities and extract discriminative features for multimodal sentiment analysis. Specifically, a visual-semantic attention model is proposed to learn attended visual features for each word. To effectively combine the sentiment information in the two modalities, a modality-gated LSTM is proposed to learn the multimodal features by adaptively selecting the modality that presents stronger sentiment information. Then a semantic self-attention model is proposed to automatically focus on the discriminative features for sentiment classification. Extensive experiments have been conducted on both manually annotated and machine weakly labeled datasets. The results demonstrate the superiority of our approach through comparison with state-of-the-art models.
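The abstract gives no equations, so the following is only an illustrative sketch of two ideas it names: attending over image regions for a given word, and gating between the text and visual features. The dot-product scoring, the scalar sigmoid gate, the toy vectors, and every function and parameter name here are assumptions made for illustration; this is not the AMGN model, which uses a modality-gated LSTM and learned parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def attend(word, regions):
    """Attention-weighted average of region vectors.

    Assumption: regions are scored by a plain dot product with the word vector.
    """
    weights = softmax([dot(word, r) for r in regions])
    dim = len(regions[0])
    return [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]

def gated_fusion(word, visual, gate_w, gate_b=0.0):
    """Blend text and visual vectors with a scalar sigmoid gate g in (0, 1).

    Assumption: the gate is computed from the concatenated [word; visual] vector.
    g near 1 favors the text feature, g near 0 favors the visual feature.
    """
    z = dot(gate_w, word + visual) + gate_b
    g = 1.0 / (1.0 + math.exp(-z))
    fused = [g * t + (1.0 - g) * v for t, v in zip(word, visual)]
    return fused, g

# Toy inputs (hypothetical): one 2-d word vector and two 2-d image regions.
word = [1.0, 0.0]
regions = [[1.0, 0.0], [0.0, 1.0]]
visual = attend(word, regions)
fused, g = gated_fusion(word, visual, gate_w=[0.5, 0.5, -0.5, -0.5])
```

In this toy setup the region that matches the word receives the larger attention weight, and the gate value decides how much of the fused vector comes from each modality. A trained model would learn the gate weights rather than fix them by hand.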


      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 16, Issue 3
        August 2020
        364 pages
        ISSN:1551-6857
        EISSN:1551-6865
        DOI:10.1145/3409646

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 5 July 2020
        • Online AM: 7 May 2020
        • Accepted: 1 March 2020
        • Revised: 1 December 2019
        • Received: 1 August 2019

        Qualifiers

        • research-article
        • Research
        • Refereed
