DOI: 10.1145/1014052.1014073

Mining and summarizing customer reviews

Published: 22 August 2004

ABSTRACT

Merchants selling products on the Web often ask their customers to review the products they have purchased and the associated services. As e-commerce becomes more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in the hundreds or even thousands. This makes it difficult for a potential customer to read them all and make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and summarize all the customer reviews of a product. This summarization task differs from traditional text summarization because we mine only the features of the product on which customers have expressed opinions, and whether those opinions are positive or negative. We do not summarize the reviews by selecting a subset of the original sentences, or by rewriting some of them, to capture the main points, as in classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.
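
The three steps named in the abstract can be illustrated with a toy sketch. The abstract does not describe the techniques themselves, so everything below is an assumption made for illustration: feature mining is replaced by simple sentence-frequency counting, orientation is decided by a tiny hand-written lexicon, and the names `mine_features`, `classify`, `summarize`, `POSITIVE`, `NEGATIVE`, and `STOPWORDS` are hypothetical. It should not be read as the paper's method.

```python
import re
from collections import Counter, defaultdict

# Hypothetical hand-written opinion lexicon (an assumption, not from the paper).
POSITIVE = {"great", "good", "excellent", "sharp"}
NEGATIVE = {"bad", "poor", "short", "blurry"}
# Hypothetical stop list so common words are not mistaken for product features.
STOPWORDS = {"the", "is", "are", "a", "an", "and", "but", "this",
             "it", "i", "very", "was", "of", "has"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def mine_features(sentences, min_support=2):
    """Step 1 (stand-in): keep words that occur in at least min_support sentences."""
    counts = Counter()
    for s in sentences:
        # Count each word once per sentence, ignoring stop and opinion words.
        counts.update(set(tokenize(s)) - STOPWORDS - POSITIVE - NEGATIVE)
    return {word for word, n in counts.items() if n >= min_support}


def classify(sentence):
    """Step 2 (stand-in): label a sentence by lexicon hits; None means no opinion."""
    words = tokenize(sentence)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def summarize(sentences, min_support=2):
    """Step 3: group opinion sentences by feature and by orientation."""
    features = mine_features(sentences, min_support)
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for s in sentences:
        label = classify(s)
        if label is None:
            continue  # not an opinion sentence under this toy lexicon
        for feature in features & set(tokenize(s)):
            summary[feature][label].append(s)
    return dict(summary)


# Made-up review sentences, used only to exercise the sketch.
reviews = [
    "The picture quality is excellent.",
    "Battery life is short.",
    "Great picture and a sharp screen.",
    "The battery was bad.",
    "I bought it in May.",
]
result = summarize(reviews)
for feature in sorted(result):
    opinions = result[feature]
    print(feature, len(opinions["positive"]), len(opinions["negative"]))
```

The output is a per-feature tally of positive and negative sentences, which is the shape of summary the abstract describes: organized by product feature and orientation, not a selection or rewrite of the original sentences.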


        Published in
        KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
        August 2004
        874 pages
        ISBN:1581138881
        DOI:10.1145/1014052

        Copyright © 2004 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
