2024 | OriginalPaper | Chapter

Multimodal Sentiment Analysis Using Deep Learning: A Review

Authors : Shreya Patel, Namrata Shroff, Hemani Shah

Published in: Advancements in Smart Computing and Information Security

Publisher: Springer Nature Switzerland


Abstract

Multimodal Sentiment Analysis (MSA) is a burgeoning field within natural language processing (NLP), also known as opinion mining. It determines sentiment (positive, negative, or neutral), subjective opinion, and emotional tone, and sometimes even finer-grained emotions such as joy, anger, and sadness. The evolution of sentiment analysis from its early days of text-only analysis to the incorporation of multimodal data has significantly enhanced the accuracy and depth of sentiment understanding. MSA is poised to play a pivotal role in extracting valuable insights from the vast amount of multimodal data generated in today's digital age. Various fusion methods have been developed to combine information from different modalities effectively. The field has also seen significant contributions from lexicon-based, machine learning-based, and deep learning-based approaches. Deep learning in particular has revolutionized MSA by enabling complex models that can effectively analyze sentiment from diverse data sources. This survey provides an overview of the critical developments in MSA, highlighting the evolution of its methods. It also presents a comparative analysis of state-of-the-art models, their performance on benchmark datasets, and their future potential, helping researchers and practitioners choose the most suitable approach for their specific tasks. The surveyed models, including SKEAFN, TEDT, UniMSE, and MMML, have exhibited impressive performance across various datasets.
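To make the fusion idea concrete, here is a minimal late-fusion sketch in Python, in which each modality's classifier output is combined by weighted averaging into a single sentiment decision. The modality names, score vectors, and weights are invented purely for illustration; none of the surveyed models works exactly this way, and the deep fusion methods covered in this review learn far richer cross-modal interactions.

```python
def late_fusion(scores, weights=None):
    """Fuse per-modality sentiment distributions by weighted averaging.

    scores: dict mapping modality name -> list of class probabilities
            over (negative, neutral, positive).
    weights: optional dict of per-modality weights; uniform if omitted.
    """
    names = list(scores)
    if weights is None:
        weights = {m: 1.0 for m in names}
    total = sum(weights[m] for m in names)
    n_classes = len(next(iter(scores.values())))
    return [
        sum(weights[m] * scores[m][i] for m in names) / total
        for i in range(n_classes)
    ]

# Toy per-modality predictions (numbers invented for the example).
scores = {
    "text":  [0.1, 0.2, 0.7],
    "audio": [0.2, 0.5, 0.3],
    "video": [0.3, 0.4, 0.3],
}
fused = late_fusion(scores)
label = ("negative", "neutral", "positive")[fused.index(max(fused))]
# fused is approximately [0.20, 0.37, 0.43], so label is "positive"
```

Early fusion would instead concatenate modality features before a single classifier, while the tensor-, attention-, and transformer-based methods surveyed here model interactions between modalities rather than simply averaging their outputs.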


Literature
1. Savla, M., Gopani, D., Ghuge, M., Chaudhari, S., Raundale, P.: Sentiment analysis of human speech using deep learning. In: 2023 3rd International Conference on Intelligent Technologies (CONIT), pp. 1–6. IEEE, June 2023
2. Bhat, A., Mahar, R., Punia, R., Srivastava, R.: Exploring multimodal sentiment analysis through cartesian product approach using BERT embeddings and ResNet-50 encodings and comparing performance with pre-existing models. In: 2022 3rd International Conference for Emerging Technology (INCET), pp. 1–6. IEEE, May 2022
3. Rao, A., Ahuja, A., Kansara, S., Patel, V.: Sentiment analysis on user-generated video, audio and text. In: 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), pp. 24–28. IEEE, February 2021
4. Zhu, L., Zhu, Z., Zhang, C., Xu, Y., Kong, X.: Multimodal sentiment analysis based on fusion methods: a survey. Inf. Fusion 95, 306–325 (2023)
5. Agarwal, A., Yadav, A., Vishwakarma, D.K.: Multimodal sentiment analysis via RNN variants. In: 2019 IEEE International Conference on Big Data, Cloud Computing, Data Science and Engineering (BCD), pp. 19–23. IEEE, May 2019
6. Boukabous, M., Azizi, M.: Multimodal sentiment analysis using audio and text for crime detection. In: 2022 2nd International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET), pp. 1–5. IEEE, March 2022
7. Lai, S., Hu, X., Li, Y., Ren, Z., Liu, Z., Miao, D.: Shared and private information learning in multimodal sentiment analysis with deep modal alignment and self-supervised multi-task learning. arXiv preprint arXiv:2305.08473 (2023)
8. Ma, J., Rong, L., Zhang, Y., Tiwari, P.: Moving from narrative to interactive multi-modal sentiment analysis: a survey. ACM Trans. Asian Low-Resour. Lang. Inf. Process. (2023)
9. Poria, S., Majumder, N., Hazarika, D., Cambria, E., Gelbukh, A., Hussain, A.: Multimodal sentiment analysis: addressing key issues and setting up the baselines. IEEE Intell. Syst. 33(6), 17–25 (2018)
10. Gandhi, A., Adhvaryu, K., Khanduja, V.: Multimodal sentiment analysis: review, application domains and future directions. In: 2021 IEEE Pune Section International Conference (PuneCon), pp. 1–5. IEEE, December 2021
11. Zadeh, A., Chen, M., Poria, S., Cambria, E., Morency, L.P.: Tensor fusion network for multimodal sentiment analysis. arXiv preprint arXiv:1707.07250 (2017)
12. Liu, Z., Shen, Y., Lakshminarasimhan, V.B., Liang, P.P., Zadeh, A., Morency, L.P.: Efficient low-rank multimodal fusion with modality-specific factors. arXiv preprint arXiv:1806.00064 (2018)
13. Tsai, Y.H.H., Bai, S., Liang, P.P., Kolter, J.Z., Morency, L.P., Salakhutdinov, R.: Multimodal transformer for unaligned multimodal language sequences. In: Proceedings of the Conference. Association for Computational Linguistics. Meeting, vol. 2019, p. 6558. NIH Public Access, July 2019
14. Wang, Y., Shen, Y., Liu, Z., Liang, P.P., Zadeh, A., Morency, L.P.: Words can shift: dynamically adjusting word representations using nonverbal behaviors. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 7216–7223, July 2019
15. Rahman, W., Hasan, M.K., Lee, S., Zadeh, A., Mao, C., Morency, L.P., Hoque, E.: Integrating multimodal information in large pretrained transformers. In: Proceedings of the Conference. Association for Computational Linguistics. Meeting, vol. 2020, p. 2359. NIH Public Access, July 2020
16. Hazarika, D., Zimmermann, R., Poria, S.: MISA: modality-invariant and -specific representations for multimodal sentiment analysis. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 1122–1131, October 2020
17. Sun, Z., Sarma, P., Sethares, W., Liang, Y.: Learning relationships between text, audio, and video via deep canonical correlation for multimodal language analysis. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, pp. 8992–8999, April 2020
18. Yu, W., Xu, H., Yuan, Z., Wu, J.: Learning modality-specific representations with self-supervised multi-task learning for multimodal sentiment analysis. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 12, pp. 10790–10797, May 2021
19. Han, W., Chen, H., Poria, S.: Improving multimodal fusion with hierarchical mutual information maximization for multimodal sentiment analysis. arXiv preprint arXiv:2109.00412 (2021)
20. Hu, G., Lin, T.E., Zhao, Y., Lu, G., Wu, Y., Li, Y.: UniMSE: towards unified multimodal sentiment analysis and emotion recognition. arXiv preprint arXiv:2211.11256 (2022)
21. Wang, F., et al.: TEDT: transformer-based encoding–decoding translation network for multimodal sentiment analysis. Cogn. Comput. 15(1), 289–303 (2023)
22. Kim, K., Park, S.: AOBERT: all-modalities-in-one BERT for multimodal sentiment analysis. Inf. Fusion 92, 37–45 (2023)
23. Zhu, C., et al.: SKEAFN: sentiment knowledge enhanced attention fusion network for multimodal sentiment analysis. Inf. Fusion 100, 101958 (2023)
24. Li, Z., et al.: Multi-level correlation mining framework with self-supervised label generation for multimodal sentiment analysis. Inf. Fusion 101891 (2023)
25.
26. Zhao, Y., Mamat, M., Aysa, A., Ubul, K.: Multimodal sentiment system and method based on CRNN-SVM. Neural Comput. Appl. 1–13 (2023)
27. Wang, D., Guo, X., Tian, Y., Liu, J., He, L., Luo, X.: TETFN: a text enhanced transformer fusion network for multimodal sentiment analysis. Pattern Recogn. 136, 109259 (2023)
28. Kaur, R., Kautish, S.: Multimodal sentiment analysis: a survey and comparison. In: Research Anthology on Implementing Sentiment Analysis Across Multiple Disciplines, pp. 1846–1870 (2022)
29. Mäntylä, M.V., Graziotin, D., Kuutila, M.: The evolution of sentiment analysis—a review of research topics, venues, and top cited papers. Comput. Sci. Rev. 27, 16–32 (2018)
30. Birjali, M., Kasri, M., Beni-Hssane, A.: A comprehensive survey on sentiment analysis: approaches, challenges and trends. Knowl.-Based Syst. 226, 107134 (2021)
31. Gandhi, A., Adhvaryu, K., Poria, S., Cambria, E., Hussain, A.: Multimodal sentiment analysis: a systematic review of history, datasets, multimodal fusion methods, applications, challenges and future directions. Inf. Fusion 91, 424–444 (2023)
32. Soleymani, M., Garcia, D., Jou, B., Schuller, B., Chang, S.F., Pantic, M.: A survey of multimodal sentiment analysis. Image Vis. Comput. 65, 3–14 (2017)
Metadata
Copyright Year: 2024
DOI: https://doi.org/10.1007/978-3-031-59097-9_2