2019 | Original Paper | Book Chapter

Spatio-Temporal Attention Model Based on Multi-view for Social Relation Understanding

Authors: Jinna Lv, Bin Wu

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

Social relation understanding is an increasingly popular research area. While great progress has been achieved in recognizing sentiment or social relations from image data, attaining satisfactory performance for social relation analysis from video remains difficult. In this paper, we propose a novel Spatio-Temporal attention model based on Multi-View features (STMV) for understanding social relations from video. First, to obtain a rich representation of social relation traits, we introduce different ConvNets to extract multi-view features, including RGB, optical flow, and face. Second, we model the temporal dynamics of each view using Long Short-Term Memory (LSTM) networks. Specifically, we propose multiple attention units in our attention module. In this manner, we generate a feature representation that focuses on multiple aspects of social relation traits in the video, so that an effective mapping from low-level video pixels to the high-level social relation space can be learned. Third, we introduce a tensor fusion layer that learns interactions among the multi-view features. Extensive experiments show that our STMV model achieves state-of-the-art performance on the SRIV video dataset for social relation classification.
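
To make the described architecture concrete, below is a minimal PyTorch sketch of the pipeline the abstract outlines: per-view feature sequences (RGB, optical flow, face) are encoded with LSTMs, each sequence is pooled by several independent attention units, and the pooled views are merged by an outer-product tensor fusion layer before classification. All layer sizes, the number of attention units, the class count, and the exact fusion scheme are illustrative assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiAttentionPool(nn.Module):
    """Pools an LSTM output sequence with K independent attention units."""

    def __init__(self, hidden_dim: int, num_units: int = 4):
        super().__init__()
        # One scoring vector per attention unit (a hypothetical design choice).
        self.scorers = nn.Linear(hidden_dim, num_units)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (batch, time, hidden_dim)
        weights = F.softmax(self.scorers(seq), dim=1)   # (batch, time, K), softmax over time
        # One attention-weighted summary per unit, then concatenate them.
        pooled = torch.einsum("btk,bth->bkh", weights, seq)
        return pooled.flatten(start_dim=1)              # (batch, K * hidden_dim)


class STMVSketch(nn.Module):
    """Illustrative spatio-temporal multi-view model with tensor fusion."""

    def __init__(self, feat_dims=(2048, 2048, 512), hidden_dim=128,
                 num_units=4, proj_dim=16, num_classes=8):
        super().__init__()
        self.lstms = nn.ModuleList(
            [nn.LSTM(d, hidden_dim, batch_first=True) for d in feat_dims])
        self.attn = nn.ModuleList(
            [MultiAttentionPool(hidden_dim, num_units) for _ in feat_dims])
        # Project each pooled view to a small vector before the outer product,
        # otherwise the fused tensor grows very large.
        self.proj = nn.ModuleList(
            [nn.Linear(num_units * hidden_dim, proj_dim) for _ in feat_dims])
        fused_dim = (proj_dim + 1) ** len(feat_dims)    # +1 for the appended constant
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, views):
        # views: list of (batch, time, feat_dim) tensors for RGB / flow / face,
        # assumed to be pre-extracted ConvNet features.
        vecs = []
        for x, lstm, attn, proj in zip(views, self.lstms, self.attn, self.proj):
            h, _ = lstm(x)
            z = proj(attn(h))
            # Append a constant 1 so unimodal terms survive the outer product.
            ones = torch.ones(z.size(0), 1, device=z.device)
            vecs.append(torch.cat([ones, z], dim=1))
        # Tensor fusion: batched outer product across the views, flattened.
        fused = vecs[0]
        for v in vecs[1:]:
            fused = torch.einsum("bi,bj->bij", fused, v).flatten(start_dim=1)
        return self.classifier(fused)


if __name__ == "__main__":
    model = STMVSketch()
    rgb, flow, face = (torch.randn(2, 30, 2048),
                       torch.randn(2, 30, 2048),
                       torch.randn(2, 30, 512))
    print(model([rgb, flow, face]).shape)   # torch.Size([2, 8])
```

The constant 1 appended to each view vector follows the usual tensor-fusion idea of letting unimodal and cross-view interaction terms coexist in the fused representation; since the fused dimensionality grows multiplicatively with the per-view size, the sketch keeps proj_dim small.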

Metadata
Title
Spatio-Temporal Attention Model Based on Multi-view for Social Relation Understanding
Authors
Jinna Lv
Bin Wu
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-05716-9_32