research-article

Sentiment analysis of blogs by combining lexical knowledge with text classification

Authors:
Prem Melville, IBM Research, Yorktown Heights, NY, USA
Wojciech Gryc, Oxford University, Oxford, United Kingdom
Richard D. Lawrence, IBM Research, Yorktown Heights, NY, USA

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 2009, Pages 1275–1284. https://doi.org/10.1145/1557019.1557156

Published: 28 June 2009

ABSTRACT

The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies that are increasingly concerned about monitoring the discussion around their products. Tracking such discussion on weblogs provides useful insight into how to improve products or market them more effectively. An important component of such analysis is characterizing the sentiment expressed in blogs about specific brands and products. Sentiment analysis focuses on this task of automatically identifying whether a piece of text expresses a positive or negative opinion about the subject matter. Most previous work in this area uses prior lexical knowledge in terms of the sentiment polarity of words. In contrast, some recent approaches treat the task as a text classification problem, in which they learn to classify sentiment based only on labeled training data. In this paper, we present a unified framework in which one can use background lexical information in terms of word-class associations and refine this information for specific domains using any available training examples. Empirical results on diverse domains show that our approach performs better than using background knowledge or training data in isolation, and better than alternative approaches to using lexical knowledge with text classification.
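
To make the idea of combining word-class associations with labeled examples concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the framework from the paper: the abstract does not specify how the two sources are combined, so the linear mixing in `pool`, the weight `alpha`, the lexicon boost `strength`, and all function names and toy data are hypothetical choices. It uses a multinomial naive Bayes model, which the paper's author tags mention.

```python
import math
from collections import Counter

CLASSES = ("pos", "neg")

def lexicon_model(vocab, lexicon, strength=4.0):
    """Per-class word distributions derived from a word -> class lexicon.
    Assumption: a word listed for a class weighs `strength` times any other word."""
    model = {}
    for c in CLASSES:
        weights = {w: (strength if lexicon.get(w) == c else 1.0) for w in vocab}
        total = sum(weights.values())
        model[c] = {w: v / total for w, v in weights.items()}
    return model

def trained_model(vocab, docs):
    """Laplace-smoothed multinomial estimates from (tokens, label) pairs."""
    counts = {c: Counter() for c in CLASSES}
    for tokens, label in docs:
        counts[label].update(t for t in tokens if t in vocab)
    model = {}
    for c in CLASSES:
        total = sum(counts[c].values()) + len(vocab)
        model[c] = {w: (counts[c][w] + 1) / total for w in vocab}
    return model

def pool(a, b, alpha=0.5):
    """Hypothetical combination rule: alpha * a + (1 - alpha) * b per word and class."""
    return {c: {w: alpha * a[c][w] + (1 - alpha) * b[c][w] for w in a[c]}
            for c in CLASSES}

def classify(model, tokens):
    """Return the class with the highest log-likelihood (uniform class prior).
    Tokens outside the vocabulary are ignored."""
    scores = {c: sum(math.log(model[c][t]) for t in tokens if t in model[c])
              for c in CLASSES}
    return max(scores, key=scores.get)

# Toy data, invented for illustration only.
lexicon = {"great": "pos", "love": "pos", "awful": "neg", "hate": "neg"}
docs = [
    (["battery", "lasts", "great"], "pos"),
    (["battery", "died", "awful"], "neg"),
    (["screen", "cracked", "hate"], "neg"),
    (["love", "screen"], "pos"),
]
vocab = set(lexicon) | {t for tokens, _ in docs for t in tokens}
combined = pool(lexicon_model(vocab, lexicon), trained_model(vocab, docs))
print(classify(combined, ["love", "battery"]))
```

Setting `alpha` to 1.0 falls back to the lexicon alone and 0.0 to the training data alone, which mirrors the two baselines the abstract compares against.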

Index Terms

Computing methodologies → Machine learning → Machine learning approaches

Published in
KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019
General Chairs: John Elder (Elder Research, Inc., USA), Françoise Soulié Fogelman (KXEN, France)
Program Chairs: Peter Flach (University of Bristol, UK), Mohammed Zaki (RPI, USA)
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
background knowledge
blog analysis
dual supervision
movie reviews
naive bayes
opinion mining
political blogs
prior knowledge
sentiment analysis
technology blogs
text mining
Qualifiers
- research-article