Published in: Discover Computing 3/2011

01.06.2011 | Web Mining for Search

Sentiment classification: a lexical similarity based approach for extracting subjectivity in documents

Authors: Kiran Sarvabhotla, Prasad Pingali, Vasudeva Varma

Abstract

With the growth of social media, document sentiment classification has become an active area of research in this decade. It can be viewed as a special case of topical classification applied only to the subjective portions of a document (the sources of sentiment). Hence, the key task in document sentiment classification is extracting subjectivity. Existing approaches to extracting subjectivity rely heavily on linguistic resources such as sentiment lexicons and on complex supervised patterns based on part-of-speech (POS) information. This makes subjective feature extraction complex and resource dependent. In this work, we try to minimize the dependency on linguistic resources in sentiment classification. We propose a simple statistical methodology called review summary (RSUMM) and use it in combination with well-known feature selection methods to extract subjectivity. Our experimental results on a movie review dataset demonstrate the effectiveness of the proposed methodology.

Footnotes
1. Positive or negative.
2. Grading reviews typically use a scale of 1–5, i.e., a starred rating (*).
3. By sentiment classification we mean document sentiment classification.
4. A document can be a review or a blog post.
5. For example, NP, VP, JJ NN, RB JJ not NN, JJ JJ not NN, RB VB, NN JJ not NN, etc., where NN stands for noun, RB for adverb, JJ for adjective, and VB for verb.
6. Support vector machines (SVM), naive Bayes, and maximum entropy-based classification.
7. The presence of negation, contrast transition, etc.
8. An annotated collection of documents.
9. An annotated collection of subjective and objective sentences.
 

Metadata
Title: Sentiment classification: a lexical similarity based approach for extracting subjectivity in documents
Authors: Kiran Sarvabhotla, Prasad Pingali, Vasudeva Varma
Publication date: 01.06.2011
Publisher: Springer Netherlands
Published in: Discover Computing, Issue 3/2011
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI: https://doi.org/10.1007/s10791-010-9161-5
