Published in: Social Network Analysis and Mining 1/2016

1 December 2016 | Original Article

Unsupervised graph-based pattern extraction for multilingual emotion classification

Authors: Elvis Saravia, Carlos Argueta, Yi-Shin Chen

Abstract

Today's connected society allows online users to share opinions on an unprecedented scale. Given this mass opinion sharing, it is crucial to devise algorithms that efficiently identify the emotions expressed in opinionated content. Traditional opinion-based classifiers require extracting high-dimensional feature representations, which are computationally expensive to process and can misrepresent the text or degrade the accuracy of a classifier. In this paper, we propose an unsupervised graph-based approach for extracting Twitter-specific emotion-bearing patterns to be used as features. By using a more representative list of patterns as features, we improve the precision and recall of a given emotion classification task. Owing to its novel bootstrapping process, the full system is also adaptable to different domains and languages. The experimental results demonstrate that the extracted patterns are effective in identifying emotions in English, Spanish, and French Twitter streams. We also provide detailed experiments and offer an extended version of our algorithm to support the classification of Indonesian microblog posts. Overall, our empirical results demonstrate that the proposed approach has desirable characteristics such as accuracy, generality, adaptability, minimal supervision, and coverage.

Footnotes
1
Predefined dictionaries require time and human effort to construct, and they may not be suitable for certain domains and languages. Moreover, directly translating them into other languages may not be effective or guarantee high accuracy.
 
3
By emotion-related terms, we are referring to connector words and subject words that are relevant to emotion.
 
4
Because the languages we study are syntactic (word order carries most or all of the meaning), graph analysis plays an important role in preserving the structure and meaning of the text.
 
5
External sources and dictionaries quickly become outdated because they may not cover such recent and commonly used informal words.
 
6
Unigrams require extra computational effort and may at times misrepresent a text.
 
7
We used about five query hashtags as noisy labels to collect the social data related to each emotion category. For example, for the emotion “sadness” we queried for tweets that contain the hashtag “#sadness” and other emotion-related hashtags.
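
The hashtag-based noisy labeling described in footnote 7 can be illustrated with a minimal sketch. This is not the authors' implementation: the seed lists below are hypothetical (only "#sadness" is named in the source), and the choice to discard tweets that match more than one emotion is an assumption made here for simplicity.

```python
import re

# Hypothetical seed hashtags per emotion; only "#sadness" comes from the source.
SEED_HASHTAGS = {
    "sadness": {"#sadness", "#sad"},
    "joy": {"#joy", "#happy"},
}

# Matches a '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#\w+")


def noisy_label(tweet, seeds=SEED_HASHTAGS):
    """Return the single emotion whose seed hashtags occur in the tweet.

    Returns None when no seed hashtag is present, or when hashtags from
    more than one emotion are present (an assumption of this sketch).
    """
    tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet)}
    matches = [emotion for emotion, seed in seeds.items() if tags & seed]
    return matches[0] if len(matches) == 1 else None


def collect(tweets, seeds=SEED_HASHTAGS):
    """Group tweets by noisy label, skipping unlabeled or ambiguous ones."""
    grouped = {emotion: [] for emotion in seeds}
    for tweet in tweets:
        label = noisy_label(tweet, seeds)
        if label is not None:
            grouped[label].append(tweet)
    return grouped
```

Calling `collect` on a list of tweet strings returns a dictionary keyed by emotion. The labels are noisy in the sense that a hashtag only suggests the emotion of the tweet and does not guarantee it.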
 
Metadata
Title
Unsupervised graph-based pattern extraction for multilingual emotion classification
Authors
Elvis Saravia
Carlos Argueta
Yi-Shin Chen
Publication date
1 December 2016
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining, Issue 1/2016
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-016-0403-4