Article

Mining product reputations on the Web

Authors:

Satoshi Morinaga
NEC Corporation, 4-1-1, Miyazaki, Miyamae, Kawasaki, Kanagawa, 216-8555, JAPAN

Kenji Yamanishi
NEC Corporation, 4-1-1, Miyazaki, Miyamae, Kawasaki, Kanagawa, 216-8555, JAPAN

Kenji Tateishi
NEC Corporation, 8916-47, Takayama-cho, Ikoma, Nara, 630-0101, JAPAN

Toshikazu Fukushima
NEC Corporation, 8916-47, Takayama-cho, Ikoma, Nara, 630-0101, JAPAN

KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 2002, Pages 341–349
https://doi.org/10.1145/775047.775098

Published: 23 July 2002


ABSTRACT

Knowing the reputations of your own and/or competitors' products is important for marketing and customer relationship management. It is, however, very costly to collect and analyze survey data manually. This paper presents a new framework for mining product reputations on the Internet. It automatically collects people's opinions about target products from Web pages, and it uses text mining techniques to obtain the reputations of those products.

On the basis of human-test samples, we generate in advance syntactic and linguistic rules to determine whether any given statement is an opinion or not, and whether any such opinion is positive or negative in nature. We first collect statements regarding target products using a general search engine. Then, using the rules, we extract opinions from among them and attach three labels to each opinion: the positive/negative determination, the product name itself, and a numerical value expressing the degree of system confidence that the statement is, in fact, an opinion. The labeled opinions are then input into an opinion database.

The mining of reputations, i.e., the finding of statistically meaningful information included in the database, is then conducted. We specify target categories using label values (such as positive opinions of product A) and perform four types of text mining: extraction of 1) characteristic words, 2) co-occurrence words, and 3) typical sentences for individual target categories, and 4) correspondence analysis among multiple target categories.

Actual marketing data is used to demonstrate the validity and effectiveness of the framework, which drastically reduces the overall cost of reputation analysis compared with conventional survey approaches and supports the discovery of knowledge from the pool of opinions on the Web.
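As a rough illustration of the labeled-opinion idea, the sketch below stores each opinion with the three labels named in the abstract and ranks characteristic words for one target category. It is a minimal sketch under stated assumptions, not the authors' method: the names `Opinion`, `tokens`, and `characteristic_words` are hypothetical, the sample opinions are invented, and the scoring is a simple smoothed log-ratio of document frequencies chosen for brevity, since the abstract does not say which measure the paper uses.

```python
from collections import Counter
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Opinion:
    """One labeled opinion (hypothetical record layout)."""
    text: str
    product: str       # product name label
    polarity: str      # "positive" or "negative"
    confidence: float  # system confidence that the statement is an opinion


def tokens(text):
    # Naive tokenizer for illustration only: lowercase, strip basic punctuation.
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def characteristic_words(opinions, product, polarity, top_n=3, min_confidence=0.0):
    """Rank words that are frequent inside the target category and rare outside it.

    Assumption: score = smoothed log-ratio of per-opinion word frequencies.
    """
    inside, outside = Counter(), Counter()
    n_in = n_out = 0
    for op in opinions:
        if op.confidence < min_confidence:
            continue  # drop statements the system is unsure are opinions
        words = set(tokens(op.text))  # count each word once per opinion
        if op.product == product and op.polarity == polarity:
            inside.update(words)
            n_in += 1
        else:
            outside.update(words)
            n_out += 1

    def score(word):
        p_in = (inside[word] + 1) / (n_in + 2)    # add-one smoothing
        p_out = (outside[word] + 1) / (n_out + 2)
        return math.log(p_in) - math.log(p_out)

    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(inside, key=lambda w: (-score(w), w))
    return ranked[:top_n]


# Invented sample data, for illustration only.
opinions = [
    Opinion("The battery life is great", "A", "positive", 0.9),
    Opinion("Great screen and great battery", "A", "positive", 0.8),
    Opinion("The screen is dim", "A", "negative", 0.7),
    Opinion("The keyboard is awful", "B", "negative", 0.9),
]

top_words = characteristic_words(opinions, "A", "positive", top_n=2)
print(top_words)
```

Selecting a target category here is just a filter on the label values, which mirrors the abstract's example of "positive opinions of product A"; the `min_confidence` threshold is an added convenience, not something the abstract describes.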


Published in
KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
July 2002, 719 pages
ISBN: 158113567X
DOI: 10.1145/775047
Conference Chair: Osmar R. Zaïane (University of Alberta, Canada)
General Chair: Randy Goebel (University of Alberta, Canada)
Program Chairs: David Hand (Imperial College, UK), Daniel Keim (AT&T), Raymond Ng (University of British Columbia, Canada)
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- Article
Conference

Acceptance Rates
KDD '02 Paper Acceptance Rate: 44 of 307 submissions, 14%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
