Multilingual Visual Sentiment Concept Matching

Authors:
Nikolaos Pappas, Idiap Research Institute, Martigny, Switzerland
Miriam Redi, Yahoo Inc., London, United Kingdom
Mercan Topkara, JW Player, New York, NY, USA
Brendan Jou, Columbia University, New York, NY, USA
Hongyi Liu, Columbia University, New York, NY, USA
Tao Chen, Columbia University, New York, NY, USA
Shih-Fu Chang, Columbia University, New York, NY, USA

ICMR '16: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, June 2016, Pages 151–158
https://doi.org/10.1145/2911996.2912016

Published: 06 June 2016

ABSTRACT

The impact of culture on visual emotion perception has recently captured the attention of multimedia research. In this study, we provide powerful computational linguistics tools to explore, retrieve and browse a dataset of 16K multilingual affective visual concepts and 7.3M Flickr images. First, we design an effective crowdsourcing experiment to collect human judgements of sentiment connected to the visual concepts. We then use word embeddings to represent these concepts in a low-dimensional vector space, allowing us to expand the meaning around concepts and thus gain insight into commonalities and differences among languages. We compare a variety of concept representations through a novel evaluation task based on the notion of visual semantic relatedness. Based on these representations, we design clustering schemes to group multilingual visual concepts, and evaluate them with novel metrics based on the crowdsourced sentiment annotations as well as visual semantic relatedness. The proposed clustering framework enables us to analyze the full multilingual dataset in depth and to show an application on a facial data subset, exploring cultural insights of portrait-related affective visual concepts.
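As a rough illustration of the idea in the abstract (representing multi-word visual concepts with word embeddings, then matching and grouping them by similarity), the sketch below may help. It is not the authors' implementation: the 3-dimensional vectors in `TOY_VECTORS` are invented for the example, summing word vectors is only one possible way to compose a concept, and the greedy threshold grouping is a simple stand-in for whatever clustering schemes the paper actually uses. All function names are hypothetical.

```python
import numpy as np

# Hypothetical 3-d word vectors, invented for illustration only.
# Real embeddings would be learned from large text corpora.
TOY_VECTORS = {
    "happy": np.array([0.9, 0.1, 0.0]),
    "joyful": np.array([0.8, 0.3, 0.1]),
    "sad": np.array([-0.9, 0.1, 0.0]),
    "child": np.array([0.1, 0.9, 0.1]),
    "kid": np.array([0.2, 0.8, 0.0]),
    "face": np.array([0.0, 0.2, 0.9]),
}


def concept_vector(concept, vectors=TOY_VECTORS):
    """Represent a multi-word concept as the L2-normalised sum of its word vectors."""
    total = np.sum([vectors[word] for word in concept.split()], axis=0)
    return total / np.linalg.norm(total)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest_concept(query, candidates):
    """Return the candidate concept most similar to the query concept."""
    q = concept_vector(query)
    return max(candidates, key=lambda c: cosine(q, concept_vector(c)))


def group_concepts(concepts, threshold=0.8):
    """Greedy grouping (an assumed stand-in for a real clustering scheme):
    a concept joins the first cluster whose seed concept is similar enough,
    otherwise it starts a new cluster."""
    clusters = []
    for concept in concepts:
        v = concept_vector(concept)
        for cluster in clusters:
            if cosine(v, concept_vector(cluster[0])) >= threshold:
                cluster.append(concept)
                break
        else:
            clusters.append([concept])
    return clusters


concepts = ["happy child", "joyful kid", "sad face", "sad child"]
print(nearest_concept("happy child", concepts[1:]))
print(group_concepts(concepts))
```

With these toy vectors, "happy child" is matched to "joyful kid", and the two end up in the same group while the two "sad" concepts stay apart. With real multilingual embeddings, the words of each language would first need to share a common vector space, which this sketch simply assumes.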

Index Terms

Multilingual Visual Sentiment Concept Matching
1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing theory, concepts and paradigms
      1. Social content sharing
      2. Social tagging
  2. Human computer interaction (HCI)
    1. Empirical studies in HCI
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Sentiment analysis
    2. Specialized information retrieval
      1. Multimedia and multimodal retrieval
  2. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases

Published in
ICMR '16: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval
June 2016
452 pages
ISBN: 9781450343596
DOI: 10.1145/2911996
General Chairs: John R. Kender (Columbia University, USA), John R. Smith (IBM Research, USA)
Program Chairs: Jiebo Luo (University of Rochester, USA), Susanne Boll (University of Oldenburg, Germany), Winston Hsu (National Taiwan University, Taiwan)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Best Multimodal paper
Author Tags
concept detection
cross-cultural
cultures
emotion
language
multilingual
ontology
sentiment
social multimedia
Qualifiers
- research-article
Acceptance Rates
ICMR '16 Paper Acceptance Rate: 20 of 120 submissions, 17%
Overall Acceptance Rate: 254 of 830 submissions, 31%