Abstract
The evaluative character of a word is called its semantic orientation. Positive semantic orientation indicates praise (e.g., "honest", "intrepid") and negative semantic orientation indicates criticism (e.g., "disturbing", "superfluous"). Semantic orientation varies in both direction (positive or negative) and degree (mild to strong). An automated system for measuring semantic orientation would have application in text classification, text filtering, tracking opinions in online discussions, analysis of survey responses, and automated chat systems (chatbots). This article introduces a method for inferring the semantic orientation of a word from its statistical association with a set of positive and negative paradigm words. Two instances of this approach are evaluated, based on two different statistical measures of word association: pointwise mutual information (PMI) and latent semantic analysis (LSA). The method is experimentally tested with 3,596 words (including adjectives, adverbs, nouns, and verbs) that have been manually labeled positive (1,614 words) and negative (1,982 words). The method attains an accuracy of 82.8% on the full test set, but the accuracy rises above 95% when the algorithm is allowed to abstain from classifying mild words.
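The abstract does not spell out the scoring formula, so the following is only a minimal sketch of one plausible reading: a word's orientation is taken as its summed association with the positive paradigm words minus its summed association with the negative ones, with PMI as the association measure. The toy corpus, the paradigm word lists, the document-level co-occurrence counting, the smoothing constant, and all function names are assumptions made for illustration; they are not taken from the article. The `threshold` parameter illustrates the idea of abstaining on mild words.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus and paradigm words (illustrative only).
DOCS = [
    "honest work is good",
    "an honest and excellent answer",
    "a disturbing and bad result",
    "disturbing news is bad and poor",
    "good food",
    "poor service",
]
POSITIVE = ["good", "excellent"]
NEGATIVE = ["bad", "poor"]


def cooccurrence_counts(documents):
    """Count how many documents contain each word and each unordered word pair."""
    word_counts, pair_counts = Counter(), Counter()
    for doc in documents:
        words = set(doc.lower().split())
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts, len(documents)


def pmi(w1, w2, counts, smoothing=0.01):
    """PMI = log2(p(w1, w2) / (p(w1) * p(w2))), estimated from document counts.

    The additive smoothing (an assumption) avoids log(0) for unseen pairs.
    Intended for w1 != w2.
    """
    word_counts, pair_counts, n_docs = counts
    joint = pair_counts[frozenset((w1, w2))] + smoothing
    c1 = word_counts[w1] + smoothing
    c2 = word_counts[w2] + smoothing
    return math.log2(joint * n_docs / (c1 * c2))


def semantic_orientation(word, counts, positive=POSITIVE, negative=NEGATIVE):
    """Association with positive paradigm words minus association with negative ones."""
    pos = sum(pmi(word, p, counts) for p in positive)
    neg = sum(pmi(word, n, counts) for n in negative)
    return pos - neg


def classify(word, counts, threshold=0.0):
    """Return 'positive' or 'negative', or None to abstain when |score| < threshold."""
    score = semantic_orientation(word, counts)
    if abs(score) < threshold:
        return None  # treated as too mild to classify
    return "positive" if score > 0 else "negative"


COUNTS = cooccurrence_counts(DOCS)
for w in ("honest", "disturbing", "food"):
    print(w, round(semantic_orientation(w, COUNTS), 2), classify(w, COUNTS, threshold=10.0))
```

The sign of the score gives the direction and its magnitude the degree, so raising `threshold` trades coverage for accuracy, in the spirit of the abstention result reported above. A realistic setup would need a far larger corpus than this toy one.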