Skip to main content
Top

2020 | OriginalPaper | Chapter

Multimodal Aspect Extraction with Region-Aware Alignment Network

Authors : Hanqian Wu, Siliang Cheng, Jingjing Wang, Shoushan Li, Lian Chi

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Fueled by the rise of social media, documents on these platforms (e.g., Twitter, Weibo) are increasingly multimodal in nature, with images in addition to text. To well automatically analyze the opinion information inside multimodal data, it’s crucial to perform aspect term extraction (ATE) on them. However, until now, the researches focus on multimodal ATE are rare. In this study, we take a step further than previous studies by proposing a Region-aware Alignment Network (RAN) that aligns text with object regions that show in an image for the multimodal ATE task. Experiments on the Twitter dataset showcase the effectiveness of our proposed model. Further researches prove that our model has better performance when extracting emotion polarized aspect terms.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Arshad, O., Gallo, I., Nawaz, S., Calefati, A.: Aiding intra-text representations with visual context for multimodal named entity recognition. In: 2019 International Conference on Document Analysis and Recognition, ICDAR 2019, pp. 337–342 (2019) Arshad, O., Gallo, I., Nawaz, S., Calefati, A.: Aiding intra-text representations with visual context for multimodal named entity recognition. In: 2019 International Conference on Document Analysis and Recognition, ICDAR 2019, pp. 337–342 (2019)
2.
go back to reference Borth, D., Ji, R., Chen, T., Breuel, T.M., Chang, S.: Large-scale visual sentiment ontology and detectors using adjective noun pairs. In: ACM Multimedia Conference, MM 2013, pp. 223–232 (2013) Borth, D., Ji, R., Chen, T., Breuel, T.M., Chang, S.: Large-scale visual sentiment ontology and detectors using adjective noun pairs. In: ACM Multimedia Conference, MM 2013, pp. 223–232 (2013)
3.
go back to reference Chen, X., et al.: Aspect sentiment classification with document-level sentiment preference modeling. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3667–3677. Association for Computational Linguistics (2020) Chen, X., et al.: Aspect sentiment classification with document-level sentiment preference modeling. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3667–3677. Association for Computational Linguistics (2020)
4.
go back to reference Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Volume 1 (Long and Short Papers), pp. 4171–4186 (2019) Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Volume 1 (Long and Short Papers), pp. 4171–4186 (2019)
5.
go back to reference Gao, H., Mao, J., Zhou, J., Huang, Z., Wang, L., Xu, W.: Are you talking to a machine? Dataset and methods for multilingual image question. In: Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, pp. 2296–2304 (2015) Gao, H., Mao, J., Zhou, J., Huang, Z., Wang, L., Xu, W.: Are you talking to a machine? Dataset and methods for multilingual image question. In: Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, pp. 2296–2304 (2015)
6.
go back to reference He, R., Lee, W.S., Ng, H.T., Dahlmeier, D.: An unsupervised neural attention model for aspect extraction. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Volume 1: Long Papers, pp. 388–397 (2017) He, R., Lee, W.S., Ng, H.T., Dahlmeier, D.: An unsupervised neural attention model for aspect extraction. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Volume 1: Long Papers, pp. 388–397 (2017)
7.
go back to reference Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 2004, pp. 168–177 (2004) Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 2004, pp. 168–177 (2004)
8.
go back to reference Karpathy, A., Li, F.: Deep visual-semantic alignments for generating image descriptions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, vol. 2015, pp. 3128–3137 (2015) Karpathy, A., Li, F.: Deep visual-semantic alignments for generating image descriptions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, vol. 2015, pp. 3128–3137 (2015)
9.
go back to reference Liu, P., Joty, S.R., Meng, H.M.: Fine-grained opinion mining with recurrent neural networks and word embeddings. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, pp. 1433–1443 (2015) Liu, P., Joty, S.R., Meng, H.M.: Fine-grained opinion mining with recurrent neural networks and word embeddings. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, pp. 1433–1443 (2015)
10.
go back to reference Lu, D., Neves, L., Carvalho, V., Zhang, N., Ji, H.: Visual attention model for name tagging in multimodal social media. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Volume 1: Long Papers, pp. 1990–1999 (2018) Lu, D., Neves, L., Carvalho, V., Zhang, N., Ji, H.: Visual attention model for name tagging in multimodal social media. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Volume 1: Long Papers, pp. 1990–1999 (2018)
11.
go back to reference Lu, J., Yang, J., Batra, D., Parikh, D.: Hierarchical question-image co-attention for visual question answering. In: Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, pp. 289–297 (2016) Lu, J., Yang, J., Batra, D., Parikh, D.: Hierarchical question-image co-attention for visual question answering. In: Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, pp. 289–297 (2016)
12.
go back to reference Ma, X., Hovy, E.H.: End-to-end sequence labeling via bi-directional LSTM-CNNS-CRF. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, Volume 1: Long Papers (2016) Ma, X., Hovy, E.H.: End-to-end sequence labeling via bi-directional LSTM-CNNS-CRF. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, Volume 1: Long Papers (2016)
13.
go back to reference Nguyen, H., Shirai, K.: A joint model of term extraction and polarity classification for aspect-based sentiment analysis. In: 2018 10th International Conference on Knowledge and Systems Engineering (KSE), pp. 323–328 (2018) Nguyen, H., Shirai, K.: A joint model of term extraction and polarity classification for aspect-based sentiment analysis. In: 2018 10th International Conference on Knowledge and Systems Engineering (KSE), pp. 323–328 (2018)
14.
go back to reference Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, pp. 91–99 (2015) Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, pp. 91–99 (2015)
15.
go back to reference Wang, J., et al.: Aspect sentiment classification with both word-level and clause-level attention networks. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pp. 4439–4445. International Joint Conferences on Artificial Intelligence Organization (2018) Wang, J., et al.: Aspect sentiment classification with both word-level and clause-level attention networks. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pp. 4439–4445. International Joint Conferences on Artificial Intelligence Organization (2018)
16.
go back to reference Wang, J., et al.: Aspect sentiment classification towards question-answering with reinforced bidirectional attention network. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3548–3557. Association for Computational Linguistics (2019) Wang, J., et al.: Aspect sentiment classification towards question-answering with reinforced bidirectional attention network. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3548–3557. Association for Computational Linguistics (2019)
17.
go back to reference Wu, H., Wang, Z., Liu, M., Huang, J.: Multi-task learning based on question-answering style reviews for aspect category classification and aspect term extraction. In: Seventh International Conference on Advanced Cloud and Big Data, CBD, vol. 2019, pp. 272–278 (2019) Wu, H., Wang, Z., Liu, M., Huang, J.: Multi-task learning based on question-answering style reviews for aspect category classification and aspect term extraction. In: Seventh International Conference on Advanced Cloud and Big Data, CBD, vol. 2019, pp. 272–278 (2019)
18.
go back to reference Xu, H., Liu, B., Shu, L., Yu, P.S.: Double embeddings and CNN-based sequence labeling for aspect extraction. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 592–598. Association for Computational Linguistics (2018) Xu, H., Liu, B., Shu, L., Yu, P.S.: Double embeddings and CNN-based sequence labeling for aspect extraction. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 592–598. Association for Computational Linguistics (2018)
19.
go back to reference Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. In: Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, pp. 2048–2057 (2015) Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. In: Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, pp. 2048–2057 (2015)
20.
go back to reference Yang, H., Zeng, B., Yang, J., Song, Y., Xu, R.: A multi-task learning model for Chinese-oriented aspect polarity classification and aspect term extraction. CoRR abs/1912.07976 (2019) Yang, H., Zeng, B., Yang, J., Song, Y., Xu, R.: A multi-task learning model for Chinese-oriented aspect polarity classification and aspect term extraction. CoRR abs/1912.07976 (2019)
21.
go back to reference Yu, D., Fu, J., Mei, T., Rui, Y.: Multi-level attention networks for visual question answering. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, pp. 4187–4195 (2017) Yu, D., Fu, J., Mei, T., Rui, Y.: Multi-level attention networks for visual question answering. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, pp. 4187–4195 (2017)
22.
go back to reference Yu, Y., Lin, H., Meng, J., Zhao, Z.: Visual and textual sentiment analysis of a microblog using deep convolutional neural networks. Algorithms 9(2), 41 (2016)MathSciNetCrossRef Yu, Y., Lin, H., Meng, J., Zhao, Z.: Visual and textual sentiment analysis of a microblog using deep convolutional neural networks. Algorithms 9(2), 41 (2016)MathSciNetCrossRef
23.
go back to reference Zhang, Q., Fu, J., Liu, X., Huang, X.: Adaptive co-attention network for named entity recognition in tweets. In: AAAI Conference on Artificial Intelligence (2018) Zhang, Q., Fu, J., Liu, X., Huang, X.: Adaptive co-attention network for named entity recognition in tweets. In: AAAI Conference on Artificial Intelligence (2018)
Metadata
Title
Multimodal Aspect Extraction with Region-Aware Alignment Network
Authors
Hanqian Wu
Siliang Cheng
Jingjing Wang
Shoushan Li
Lian Chi
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-60450-9_12

Premium Partner