ABSTRACT
Merchants selling products on the Web often ask their customers to review the products they have purchased and the associated services. As e-commerce becomes more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in the hundreds or even thousands. This makes it difficult for a potential customer to read them all and make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and summarize all the customer reviews of a product. This summarization task differs from traditional text summarization because we mine only the product features on which customers have expressed opinions and whether those opinions are positive or negative. We do not summarize the reviews by selecting a subset of the original sentences or rewriting some of them to capture the main points, as in classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results, using reviews of a number of products sold online, demonstrate the effectiveness of the techniques.
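To make the shape of a feature-based summary concrete, the three steps can be sketched in a few lines of Python. This is only an illustrative toy, not the techniques proposed in the paper: the feature set `FEATURES`, the opinion word lists `POSITIVE` and `NEGATIVE`, and the helper names `classify` and `summarize` are all assumptions made up for this example, and step (1) is replaced by a hand-picked feature list rather than mined from the reviews.

```python
from collections import defaultdict

# Hypothetical inputs: a hand-picked feature list and a tiny opinion lexicon.
# In the paper, features are mined from the reviews; here they are assumed.
FEATURES = {"battery", "picture", "size"}
POSITIVE = {"great", "good", "amazing"}
NEGATIVE = {"poor", "bad", "short"}


def classify(sentence):
    """Return 'positive', 'negative', or None based on opinion word counts."""
    words = sentence.lower().replace(",", "").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None  # no opinion words, or they cancel out


def summarize(reviews):
    """Group opinion sentences by product feature and orientation."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for review in reviews:
        for sentence in review.split("."):
            sentence = sentence.strip()
            if not sentence:
                continue
            orientation = classify(sentence)
            if orientation is None:
                continue  # not an opinion sentence under this toy rule
            words = set(sentence.lower().replace(",", "").split())
            for feature in FEATURES & words:
                summary[feature][orientation].append(sentence)
    return dict(summary)


reviews = [
    "The picture quality is great. The battery life is poor.",
    "Amazing picture. I bought it last week.",
]
result = summarize(reviews)
for feature, groups in sorted(result.items()):
    print(feature, len(groups["positive"]), len(groups["negative"]))
```

The output is a per-feature tally of positive and negative opinion sentences, with the sentences themselves kept so a reader can drill down, which is the kind of summary the abstract contrasts with sentence-selection summarization.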
Index Terms
- Mining and summarizing customer reviews