Published in: Social Network Analysis and Mining 1/2016

01.12.2016 | Original Article

From classification to quantification in tweet sentiment analysis

Authors: Wei Gao, Fabrizio Sebastiani

Abstract

Sentiment classification has become a ubiquitous enabling technology in the Twittersphere, since classifying tweets according to the sentiment they convey towards a given entity (be it a product, a person, a political party, or a policy) has applications in political science, social science, market research, and many other fields. In this paper, we contend that most previous studies dealing with tweet sentiment classification (TSC) use a suboptimal approach. The reason is that the final goal of most such studies is not to estimate the class label (e.g., Positive, Negative, or Neutral) of individual tweets, but to estimate the relative frequency (a.k.a. “prevalence”) of the different classes in the dataset. The latter task is called quantification, and recent research has convincingly shown that it should be tackled as a task of its own, using learning algorithms and evaluation measures different from those used for classification. We show (by carrying out experiments using two learners, seven quantification-specific algorithms, and 11 TSC datasets) that using quantification-specific algorithms produces substantially better class frequency estimates than a state-of-the-art classification-oriented algorithm routinely used in TSC. We thus argue that researchers interested in tweet sentiment prevalence should switch to quantification-specific (instead of classification-specific) learning algorithms and evaluation measures.
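
The distinction between classifying individual tweets and estimating class prevalence can be illustrated with a small Python sketch. It is not taken from the paper: the toy labels, the two hypothetical classifiers, and the helper names `prevalence` and `accuracy` are assumptions made purely for illustration. It shows that two sets of predictions with identical classification accuracy can yield quite different class frequency estimates when the predicted labels are simply counted.

```python
from collections import Counter

CLASSES = ("Positive", "Negative", "Neutral")


def prevalence(labels, classes=CLASSES):
    """Relative frequency of each class in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {c: counts[c] / total for c in classes}


def accuracy(true, pred):
    """Fraction of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)


# Toy data, invented for illustration only (not from the paper).
true_labels = ["Positive"] * 5 + ["Negative"] * 3 + ["Neutral"] * 2

# Hypothetical classifier A: two errors that cancel out in the aggregate.
pred_a = list(true_labels)
pred_a[0] = "Negative"  # a Positive tweet labelled Negative
pred_a[5] = "Positive"  # a Negative tweet labelled Positive

# Hypothetical classifier B: two errors that push in the same direction.
pred_b = list(true_labels)
pred_b[0] = "Negative"
pred_b[1] = "Negative"

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, accuracy(true_labels, pred), prevalence(pred))
print("true", prevalence(true_labels))
```

Both hypothetical classifiers get the same number of tweets right, yet only the first reproduces the true class distribution. This is one way to see why a measure of individual-label correctness may say little about the quality of prevalence estimates.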

Footnotes
1
Consistent with most mathematical literature, we use the caret symbol (\(\wedge\)) to indicate estimation.
 
2
Since the standard logistic function \(\frac{e^{x}}{e^{x}+1}\) ranges on \([\frac{1}{2},1]\) for the domain \([0,+\infty)\) we are interested in, we multiply it by 2 so that it ranges on \([1,2]\), and subtract 1 so that it ranges on \([0,1]\), as desired.
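
The rescaling described in this footnote can be checked numerically. The following is a minimal Python sketch, not the authors' code; the function name `rescaled_logistic` is hypothetical. It evaluates \(2\cdot\frac{e^{x}}{e^{x}+1}-1\) for a few non-negative inputs.

```python
import math


def rescaled_logistic(x):
    """Compute 2 * e^x / (e^x + 1) - 1 for x >= 0 (hypothetical helper)."""
    # 1 / (1 + e^(-x)) is algebraically equal to e^x / (e^x + 1)
    # and avoids overflow for large x.
    logistic = 1.0 / (1.0 + math.exp(-x))
    return 2.0 * logistic - 1.0


# The value is 0 at x = 0 and grows towards 1 as x increases.
for x in (0.0, 0.5, 1.0, 5.0, 20.0):
    print(x, rescaled_logistic(x))
```

Algebraically, the rescaled function equals \(\tanh(x/2)\), which is another way to see that it maps \([0,+\infty)\) into the unit interval.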
 
4
In Joachims (2005), SVM-perf is actually called SVM-multi, but the author has released its implementation under the name SVM-perf; we will thus use this latter name.
 
5
SVM-perf is available from http://svmlight.joachims.org/svm_struct.html, while the module that customizes it to \({{\mathrm{KLD}}}\) is available from http://hlt.isti.cnr.it/quantification/. The code for all the other methods discussed in this section is available from http://alt.qcri.org/~wgao/codes/tweet_sentiment_quantification.zip.
 
6
This means that we avoid TSC datasets in which the labels are automatically derived from, say, the emoticons present in the tweets.
 
7
In order to enhance the reproducibility of our experimental results, we make available (at http://alt.qcri.org/~wgao/data/SNAM/tweet_sentiment_quantification.zip) the vectorial representations we have generated for all the datasets (split into training / validation / test sets) used in this paper.
 
8
The SVM-based implementation of CC is called SVM(HL) in Gao and Sebastiani (2015). LIBSVM is available from http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
 
9
At the time of writing this paper, the test set of the SemEval2016 collection has not yet been made available. However, the data made available by the organizers were already pre-split into three subsets, called “train”, “dev”, and “devtest”; we have thus used these subsets as the training set, held-out set, and test set, respectively.
 
References
Alaíz-Rodríguez R, Guerrero-Curieses A, Cid-Sueiro J (2011) Class and subclass probability re-estimation to adapt a classifier in the presence of concept drift. Neurocomputing 74(16):2614–2623
Asur S, Huberman BA (2010) Predicting the future with social media. In: Proceedings of the 10th IEEE/WIC/ACM international conference on web intelligence (WI 2010), pp 492–499, Toronto, CA
Balikas G, Partalas I, Gaussier E, Babbar R, Amini M-R (2015) Efficient model selection for regularized classification by exploiting unlabeled data. In: Proceedings of the 14th international symposium on intelligent data analysis (IDA 2015), pp 25–36, Saint Etienne, FR
Barranquero J, González P, Díez J, del Coz JJ (2013) On the study of nearest neighbor algorithms for prevalence estimation in binary problems. Pattern Recognit 46(2):472–482
Barranquero J, Díez J, del Coz JJ (2015) Quantification-oriented learning based on reliable classifiers. Pattern Recognit 48(2):591–604
Beijbom O, Hoffman J, Yao E, Darrell T, Rodriguez-Ramirez A, Gonzalez-Rivero M, Hoegh-Guldberg O (2015) Quantification in-the-wild: Data-sets and baselines. Presented at the NIPS 2015 Workshop on Transfer and Multi-Task Learning, Montreal, CA
Bella A, Ferri C, Hernández-Orallo J, Ramírez-Quintana MJ (2010) Quantification via probability estimators. In: Proceedings of the 11th IEEE international conference on data mining (ICDM 2010), pp 737–742, Sydney, AU
Berardi G, Esuli A, Sebastiani F (2015) Utility-theoretic ranking for semi-automated text classification. ACM Trans Knowl Discov Data 10(1). Article 6
Bollen J, Mao H, Zeng X-J (2011) Twitter mood predicts the stock market. J Comput Sci 2(1):1–8
Borge-Holthoefer J, Magdy W, Darwish K, Weber I (2015) Content and network dynamics behind Egyptian political polarization on Twitter. In: Proceedings of the 18th ACM conference on computer supported cooperative work and social computing (CSCW 2015), pp 700–711, Vancouver, CA
Burton S, Soboleva A (2011) Interactive or reactive? Marketing with Twitter. J Consumer Mark 28(7):491–499
Chan YS, Ng HT (2006) Estimating class priors in domain adaptation for word sense disambiguation. In: Proceedings of the 44th annual meeting of the Association for Computational Linguistics (ACL 2006), pp 89–96, Sydney, AU
Chang C-C, Lin C-J (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3). Article 27
Conroy BR, Sajda P (2012) Fast, exact model selection and permutation testing for L2-regularized logistic regression. In: Proceedings of the 15th international conference on artificial intelligence and statistics (AISTATS 2012), pp 246–254, La Palma, ES
Csiszár I, Shields PC (2004) Information theory and statistics: a tutorial. Found Trends Commun Inf Theory 1(4):417–528
Da San Martino G, Gao W, Sebastiani F (2016) QCRI at SemEval-2016 Task 4: probabilistic methods for binary and ordinal quantification. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval 2016), San Diego, US (forthcoming)
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39(1):1–38
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Dodds PS, Harris KD, Kloumann IM, Bliss CA, Danforth CM (2011) Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter. PLoS One 6(12):e26752
Esuli A (2016) ISTI-CNR at SemEval-2016 Task 4: quantification on an ordinal scale. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval 2016), San Diego, US
Esuli A, Sebastiani F (2010) Sentiment quantification. IEEE Intell Syst 25(4):72–75
Esuli A, Sebastiani F (2014) Explicit loss minimization in quantification applications (preliminary draft). In: Presented at the 8th international workshop on information filtering and retrieval (DART 2014), Pisa, IT
Esuli A, Sebastiani F (2015) Optimizing text quantifiers for multivariate loss functions. ACM Trans Knowl Discov Data 9(4). Article 27
Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874
Forman G (2005) Counting positives accurately despite inaccurate classification. In: Proceedings of the 16th European Conference on machine learning (ECML 2005), pp 564–575, Porto, PT
Gao W, Sebastiani F (2015) Tweet sentiment: from classification to quantification. In: Proceedings of the 7th international conference on advances in social network analysis and mining (ASONAM 2015), pp 97–104, Paris, FR
González-Castro V, Alaiz-Rodríguez R, Alegre E (2013) Class distribution estimation based on the Hellinger distance. Inf Sci 218:146–164
Herfort B, Schelhorn S-J, de Albuquerque JP, Zipf A (2014) Does the spatiotemporal distribution of tweets match the spatiotemporal distribution of flood phenomena? A study about the river Elbe flood in June 2013. In: Proceedings of the 11th international conference on information systems for crisis response and management (ISCRAM 2014), pp 747–751, Philadelphia, US
Hopkins DJ, King G (2010) A method of automated nonparametric content analysis for social science. Am J Political Sci 54(1):229–247
Joachims T (2005) A support vector method for multivariate performance measures. In: Proceedings of the 22nd international conference on machine learning (ICML 2005), pp 377–384, Bonn, DE
Joachims T, Hofmann T, Yue Y, Yu C-N (2009) Predicting structured objects with support vector machines. Commun ACM 52(11):97–104
Kaya M, Fidan G, Toroslu IH (2013) Transfer learning using Twitter data for improving sentiment classification of Turkish political news. In: Proceedings of the 28th international symposium on computer and information sciences (ISCIS 2013), pp 139–148, Paris, FR
Kiritchenko S, Zhu X, Mohammad SM (2014) Sentiment analysis of short informal texts. J Artif Intell Res 50:723–762
Latinne P, Saerens M, Decaestecker C (2001) Adjusting the outputs of a classifier to new a priori probabilities may significantly improve classification accuracy: evidence from a multi-class problem in remote sensing. In: Proceedings of the 18th international conference on machine learning (ICML 2001), pp 298–305
Lewis DD (1995) Evaluating and optimizing autonomous text classification systems. In: Proceedings of the 18th ACM international conference on research and development in information retrieval (SIGIR 1995), pp 246–254, Seattle, US
Limsetto N, Waiyamai K (2011) Handling concept drift via ensemble and class distribution estimation technique. In: Proceedings of the 7th international conference on advanced data mining (ADMA 2011), pp 13–26, Beijing, CN
Marchetti-Bowick M, Chambers N (2012) Learning for microblogs with distant supervision: political forecasting with Twitter. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), pp 603–612, Avignon, FR
Martínez-Cámara E, Martín-Valdivia MT, López LAU, Ráez AM (2014) Sentiment analysis in Twitter. Nat Lang Eng 20(1):1–28
Mejova Y, Weber I, Macy MW (eds) (2015) Twitter: a digital socioscope. Cambridge University Press, Cambridge
Milli L, Monreale A, Rossetti G, Giannotti F, Pedreschi D, Sebastiani F (2013) Quantification trees. In: Proceedings of the 13th IEEE international conference on data mining (ICDM 2013), pp 528–536, Dallas, US
Mohammad SM, Kiritchenko S, Zhu X (2013) NRC-Canada: building the state-of-the-art in sentiment analysis of tweets. In: Proceedings of the 7th international workshop on semantic evaluation (SemEval 2013), pp 321–327, Atlanta, US
Murphy KP (2012) Machine learning. A probabilistic perspective. The MIT Press, Cambridge
Nakov P, Rosenthal S, Kozareva Z, Stoyanov V, Ritter A, Wilson T (2013) SemEval-2013 Task 2: sentiment analysis in Twitter. In: Proceedings of the 7th international workshop on semantic evaluation (SemEval 2013), pp 312–320, Atlanta, US
Nakov P, Ritter A, Rosenthal S, Sebastiani F, Stoyanov V (2016) SemEval-2016 Task 4: sentiment analysis in Twitter. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval 2016), San Diego, US (forthcoming)
Narasimhan H, Li S, Kar P, Chawla S, Sebastiani F (2016) Stochastic optimization techniques for quantification performance measures. Submitted for publication
O’Connor B, Balasubramanyan R, Routledge BR, Smith NA (2010) From tweets to polls: linking text sentiment to public opinion time series. In: Proceedings of the 4th AAAI Conference on Weblogs and Social Media (ICWSM 2010), Washington, US
Olteanu A, Vieweg S, Castillo C (2015) What to expect when the unexpected happens: social media communications across crises. In: Proceedings of the 18th ACM conference on computer supported cooperative work and social computing (CSCW 2015), pp 994–1009, Vancouver, CA
Pan W, Zhong E, Yang Q (2012) Transfer learning for text mining. In: Aggarwal CC, Zhai CX (eds) Mining text data. Springer, Heidelberg, pp 223–258
Qureshi MA, O’Riordan C, Pasi G (2013) Clustering with error estimation for monitoring reputation of companies on Twitter. In: Proceedings of the 9th Asia Information Retrieval Societies Conference (AIRS 2013), pp 170–180, Singapore, SN
Rosenthal S, Nakov P, Kiritchenko S, Mohammad S, Ritter A, Stoyanov V (2015) SemEval-2015 Task 10: sentiment analysis in Twitter. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), pp 451–463, Denver, US
Rosenthal S, Ritter A, Nakov P, Stoyanov V (2014) SemEval-2014 Task 9: sentiment analysis in Twitter. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014), pp 73–80, Dublin, IE
Saerens M, Latinne P, Decaestecker C (2002) Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Comput 14(1):21–41
Saif H, Fernandez M, He Y, Alani H (2013) Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold. In: Proceedings of the 1st international workshop on emotion and sentiment in social and expressive media (ESSEM 2013), pp 9–21, Torino, IT
Sánchez L, González V, Alegre E, Alaiz R (2008) Classification and quantification based on image analysis for sperm samples with uncertain damaged/intact cell proportions. In: Proceedings of the 5th international conference on image analysis and recognition (ICIAR 2008), pp 827–836, Póvoa de Varzim, PT
Takahashi T, Abe S, Igata N (2011) Can Twitter be an alternative of real-world sensors? In: Proceedings of the 14th international conference on human–computer interaction (HCI International 2011), pp 240–249, Orlando, US
Tang L, Gao H, Liu H (2010) Network quantification despite biased labels. In: Proceedings of the 8th workshop on mining and learning with graphs (MLG 2010), pp 147–154, Washington, US
Tsochantaridis I, Joachims T, Hofmann T, Altun Y (2005) Large margin methods for structured and interdependent output variables. J Mach Learn Res 6:1453–1484
Vapnik V (1998) Statistical learning theory. Wiley, New York
Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83
Wu T-F, Lin C-J, Weng RC (2004) Probability estimates for multi-class classification by pairwise coupling. J Mach Learn Res 5:975–1005
Xue JC, Weiss GM (2009) Quantification and semi-supervised classification methods for handling changes in class distribution. In: Proceedings of the 15th ACM international conference on knowledge discovery and data mining (SIGKDD 2009), pp 897–906, Paris, FR
Zhang Z, Zhou J (2010) Transfer estimation of evolving class priors in data stream classification. Pattern Recognit 43(9):3151–3161
Zhu X, Kiritchenko S, Mohammad SM (2014) NRC-Canada-2014: recent improvements in the sentiment analysis of tweets. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014), pp 443–447, Dublin, IE
Zou F, Wang Y, Yang Y, Zhou K, Chen Y, Song J (2015) Supervised feature learning via L2-norm regularized logistic regression for 3D object recognition. Neurocomputing 151:603–611
Metadata
Title
From classification to quantification in tweet sentiment analysis
Authors
Wei Gao
Fabrizio Sebastiani
Publication date
01.12.2016
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2016
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-016-0327-z