Article

Mining and summarizing customer reviews

Authors:
Minqing Hu, University of Illinois at Chicago, Chicago, IL
Bing Liu, University of Illinois at Chicago, Chicago, IL

KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 2004, Pages 168–177, https://doi.org/10.1145/1014052.1014073

Published: 22 August 2004

ABSTRACT

Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce becomes more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in the hundreds or even thousands. This makes it difficult for a potential customer to read them all and make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset of, or rewriting some of, the original sentences from the reviews to capture the main points, as in classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.
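The three steps named in the abstract can be illustrated with a toy sketch. This is not the paper's technique: the abstract does not give the algorithms, so the version below substitutes a simple word-frequency threshold for feature mining and a tiny hand-written word list for polarity. All names (`mine_features`, `classify`, `summarize`, `min_support`) and the word lists are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, purely illustrative word lists (not from the paper).
POSITIVE = {"great", "good", "amazing", "sharp"}
NEGATIVE = {"poor", "bad", "short", "blurry"}
STOPWORDS = {"the", "is", "a", "and", "but", "its", "are", "this", "i", "it", "very"}


def sentences(review):
    """Split a review into lowercase sentences on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", review.lower()) if s.strip()]


def tokens(sentence):
    """Return the alphabetic words of a sentence."""
    return re.findall(r"[a-z]+", sentence)


def mine_features(reviews, min_support=2):
    """Step 1 (assumed stand-in): keep words that occur in at least
    min_support sentences and are neither stopwords nor opinion words."""
    counts = Counter()
    for review in reviews:
        for s in sentences(review):
            counts.update(set(tokens(s)) - STOPWORDS - POSITIVE - NEGATIVE)
    return {word for word, n in counts.items() if n >= min_support}


def classify(sentence_tokens):
    """Step 2 (assumed stand-in): label a sentence by counting opinion words.
    Returns None when the sentence carries no net opinion."""
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in sentence_tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def summarize(reviews, min_support=2):
    """Step 3: group opinion sentences by feature and polarity."""
    features = mine_features(reviews, min_support)
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for review in reviews:
        for s in sentences(review):
            toks = tokens(s)
            label = classify(toks)
            if label is None:
                continue
            for feature in features.intersection(toks):
                summary[feature][label].append(s)
    return dict(summary)


reviews = [
    "The picture is great. The battery is poor.",
    "Amazing picture! Battery life is short.",
    "I love it.",
]
for feature, opinions in sorted(summarize(reviews).items()):
    print(feature, len(opinions["positive"]), len(opinions["negative"]))
```

The output is a feature-by-polarity grouping rather than a condensed text, which mirrors the distinction the abstract draws from classic summarization. A realistic system would need part-of-speech information and a much larger opinion lexicon, both of which are omitted here.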

Index Terms

Mining and summarizing customer reviews
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. Information systems applications
    1. Data mining

Published in
KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2004
874 pages
ISBN: 1581138881
DOI: 10.1145/1014052
General Chairs: Won Kim (Cyber Database Solutions), Ronny Kohavi (Amazon.com)
Program Chairs: Johannes Gehrke (Cornell University), William DuMouchel (AT&T Labs Research)
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
reviews
sentiment classification
summarization
text mining
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
Article Metrics
- Total Citations: 3,900
- Total Downloads: 20,977
- Downloads (Last 12 months): 1,101
- Downloads (Last 6 weeks): 158