
01-12-2016 | Original Article

Unsupervised graph-based pattern extraction for multilingual emotion classification

Authors: Elvis Saravia, Carlos Argueta, Yi-Shin Chen

Published in: Social Network Analysis and Mining | Issue 1/2016


Abstract

The connected society we live in today has allowed online users to share opinions on an unprecedented scale. With the advent of mass opinion sharing, it is crucial to devise algorithms that efficiently identify the emotions expressed within opinionated content. Traditional opinion-based classifiers require extracting high-dimensional feature representations, which are computationally expensive to process and can degrade the accuracy of a classifier. In this paper, we propose an unsupervised graph-based approach for extracting Twitter-specific emotion-bearing patterns to be used as features. By using a more representative list of patterns as features, we improve the precision and recall of a given emotion classification task. Owing to its novel bootstrapping process, the full system is also adaptable to different domains and languages. The experimental results demonstrate that the extracted patterns are effective in identifying emotions in English, Spanish, and French Twitter streams. We also provide detailed experiments and offer an extended version of our algorithm to support the classification of Indonesian microblog posts. Overall, our empirical results demonstrate that the proposed approach has desirable characteristics such as accuracy, generality, adaptability, minimal supervision, and coverage.


Footnotes
1
Predefined dictionaries require time and human effort to construct, and they may not be suitable for certain domains and languages. Moreover, directly translating them into other languages may not be effective or guarantee high accuracy.
 
3
By emotion-related terms, we are referring to connector words and subject words that are relevant to emotion.
 
4
Considering that the languages we are studying are syntactic, where word order carries most or all of the meaning, graph analysis plays an important role in preserving the structure and meaning of text.
 
5
Other external sources or dictionaries quickly become outdated, since they might not contain such recent and commonly used informal words.
 
6
Unigrams require extra computational effort and may at times misrepresent a text.
 
7
We used around 5 query hashtags, as noisy labels, to collect the social data related to each emotion category. For example, for the emotion “sadness” we queried for tweets that contain the hashtag “#sadness” and other emotion-related hashtags.
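
The hashtag-based collection described in footnote 7 can be illustrated with a minimal sketch. It assumes a small, hypothetical hashtag list per emotion (only "#sadness" comes from the text above; the other hashtags, the `EMOTION_HASHTAGS` table and the `noisy_label` function are illustrative names), and it assumes that tweets matching more than one emotion are discarded, which the source does not state.

```python
import re
from typing import Optional

# Hypothetical query hashtags per emotion. Only "#sadness" is taken from
# the footnote; every other entry is an assumption for illustration.
EMOTION_HASHTAGS = {
    "sadness": {"#sadness", "#sad"},
    "joy": {"#joy", "#happy"},
}

HASHTAG_RE = re.compile(r"#\w+")


def noisy_label(tweet: str) -> Optional[str]:
    """Return the emotion whose query hashtags occur in the tweet.

    Returns None when no query hashtag is present, or when hashtags of
    more than one emotion are present (an assumed tie-breaking rule).
    """
    tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet)}
    matches = [
        emotion
        for emotion, hashtags in EMOTION_HASHTAGS.items()
        if tags & hashtags
    ]
    return matches[0] if len(matches) == 1 else None
```

Labels obtained this way are noisy by construction: a hashtag signals the emotion only approximately, so such labels serve as weak supervision rather than ground truth.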
 
Metadata
Title
Unsupervised graph-based pattern extraction for multilingual emotion classification
Authors
Elvis Saravia
Carlos Argueta
Yi-Shin Chen
Publication date
01-12-2016
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2016
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-016-0403-4
