
2022 | Original Paper | Book Chapter

Combining Knowledge and Multi-modal Fusion for Meme Classification

Authors: Qi Zhong, Qian Wang, Ji Liu

Published in: MultiMedia Modeling

Publisher: Springer International Publishing


Abstract

Internet memes are widespread on social media platforms such as Twitter and Facebook. Meme classification has recently become an active research topic, especially meme sentiment classification and offensive meme classification. Internet memes carry multi-modal information, with the meme text embedded in the meme image. Existing methods classify memes by simply concatenating global visual and textual features into a multi-modal representation. However, these approaches ignore both the noise introduced by global visual features and the potential common information in the multi-modal representation of a meme. In this paper, we propose a meme classification model named MeBERT. Our method enhances the semantic representation of the meme by introducing conceptual information from external Knowledge Bases (KBs). To reduce noise, a concept-image attention module is then designed to extract a concept-sensitive visual representation. In addition, a deep convolution tensor fusion module is built to integrate the multi-modal information effectively. To verify the effectiveness of the model on meme sentiment classification and offensive meme classification, we conducted experiments on the Memotion and MultiOFF datasets. The experimental results show that MeBERT outperforms state-of-the-art techniques for meme classification.
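The abstract names two mechanisms without giving their equations: attention from a concept representation over visual features, and tensor-based fusion of the modalities. The sketch below is not the MeBERT implementation. It assumes scaled dot-product attention over a handful of image-region vectors, and a plain outer-product fusion with a constant 1 prepended to each vector (the construction used in tensor fusion networks). The convolution that the "deep convolution tensor fusion" module presumably applies to the fused tensor is left out. All function names, dimensions, and values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concept_image_attention(concept, regions):
    """Pool image-region vectors, weighted by similarity to a concept vector.

    Assumption: scaled dot-product scoring; the paper may score differently.
    """
    scale = math.sqrt(len(concept))
    scores = [
        sum(c * r for c, r in zip(concept, region)) / scale
        for region in regions
    ]
    weights = softmax(scores)
    dim = len(regions[0])
    attended = [
        sum(w * region[i] for w, region in zip(weights, regions))
        for i in range(dim)
    ]
    return attended, weights


def tensor_fusion(text_vec, visual_vec):
    """Outer product of two vectors, each with a constant 1 prepended.

    The prepended 1 keeps the unimodal features alongside the pairwise
    interaction terms in the fused matrix.
    """
    t = [1.0] + list(text_vec)
    v = [1.0] + list(visual_vec)
    return [[a * b for b in v] for a in t]


# Toy inputs: one 2-d concept vector and three 2-d image-region vectors.
concept = [1.0, 0.0]
regions = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]

attended, weights = concept_image_attention(concept, regions)
fused = tensor_fusion([0.5, -0.5], attended)
```

In this toy setup the region most aligned with the concept receives the largest weight, so regions unrelated to the concept contribute less to the pooled visual vector. The fused matrix has shape (2+1) x (2+1): its first row and first column carry the unimodal features, and the remaining entries carry their pairwise products.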


Metadata
Title
Combining Knowledge and Multi-modal Fusion for Meme Classification
Authors
Qi Zhong
Qian Wang
Ji Liu
Copyright Year
2022
DOI
https://doi.org/10.1007/978-3-030-98358-1_47
