Generating contextualized sentiment lexica based on latent topics and user ratings

ABSTRACT
Sentiment lexica are useful for analyzing opinions in Web collections, for domain-dependent sentiment classification, and as sub-components of recommender systems. In this paper, we present a strategy for automatically generating topic-dependent lexica from large corpora of review articles by exploiting accompanying user ratings. Our approach combines text segmentation, discriminative feature analysis techniques, and latent topic extraction to infer the polarity of n-grams in a topical context. Our experiments on rating prediction demonstrate a substantial performance improvement in comparison with existing state-of-the-art sentiment lexica.
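As a rough illustration of the idea, the sketch below scores n-grams per topic by comparing their smoothed relative frequency in highly rated and poorly rated review segments. It is not the method of the paper: the topic labels are assumed to be given (the paper infers them through latent topic extraction), and the rating threshold, the additive smoothing, the log-ratio score, and the names `ngrams` and `topic_polarity` are hypothetical choices made only for this example.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def topic_polarity(segments, threshold=3, max_n=2, alpha=1.0):
    """Score n-grams per topic from rated review segments.

    segments: iterable of (topic, rating, text) triples. The topic label is
    assumed to be known here; a real system would have to infer it.
    Returns {topic: {ngram: score}}; a score above 0 leans positive,
    a score below 0 leans negative.
    """
    pos = defaultdict(Counter)  # topic -> n-gram counts in high-rated segments
    neg = defaultdict(Counter)  # topic -> n-gram counts in low-rated segments
    for topic, rating, text in segments:
        if rating == threshold:
            continue  # assumption: treat the middle rating as neutral and skip it
        target = pos if rating > threshold else neg
        tokens = text.lower().split()
        for n in range(1, max_n + 1):
            target[topic].update(ngrams(tokens, n))

    lexicon = {}
    for topic in set(pos) | set(neg):
        vocab = set(pos[topic]) | set(neg[topic])
        # Additive smoothing keeps unseen n-grams from producing log(0).
        pos_total = sum(pos[topic].values()) + alpha * len(vocab)
        neg_total = sum(neg[topic].values()) + alpha * len(vocab)
        lexicon[topic] = {
            g: math.log((pos[topic][g] + alpha) / pos_total)
            - math.log((neg[topic][g] + alpha) / neg_total)
            for g in vocab
        }
    return lexicon


# Toy data: the same word appears in differently rated segments of two topics.
segments = [
    ("horror_film", 5, "a truly scary movie"),
    ("horror_film", 1, "a boring movie"),
    ("airline", 1, "a scary flight"),
    ("airline", 5, "a smooth flight"),
]
lex = topic_polarity(segments)
print(lex["horror_film"]["scary"], lex["airline"]["scary"])
```

On this toy input, "scary" receives a positive score in the `horror_film` topic and a negative one in the `airline` topic, which is the kind of topic-dependent polarity a contextualized lexicon is meant to capture.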