Published in: Artificial Intelligence Review 4/2017

2 July 2016

A comparative empirical study on social media sentiment analysis over various genres and languages

Authors: Viktor Hangya, Richárd Farkas

Abstract

People use social media channels to express their opinions about things such as products, celebrities and services. Analyzing this textual content for sentiment is a gold mine for marketing experts and for research in the humanities, so automatic sentiment analysis has become a popular area of applied artificial intelligence. The chief objective of this paper is to investigate automatic sentiment analysis of social media content across various text sources and languages. The comparative findings may give useful insights to artificial intelligence researchers who develop sentiment analyzers for a new textual source. To this end, we describe supervised machine-learning-based systems that perform sentiment analysis, and we comparatively evaluate them on seven publicly available English and Hungarian databases containing text documents taken from Twitter and product review sites. We discuss the differences among these text genres and languages in terms of document- and target-level sentiment analysis.

Metadata
Title
A comparative empirical study on social media sentiment analysis over various genres and languages
Authors
Viktor Hangya
Richárd Farkas
Publication date
2 July 2016
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review, Issue 4/2017
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-016-9489-3
