ABSTRACT
Organizations conducting electronic commerce (e-commerce) can benefit greatly from the insight that data mining of transactional and clickstream data provides. Such insight not only helps improve the electronic channel (e.g., a web site) but also serves as a learning vehicle for the larger organization conducting business at brick-and-mortar stores. The e-commerce site serves as an early alert system for emerging patterns and as a laboratory for experimentation. Successful data mining needs several ingredients, and e-commerce provides all the right ones (the Good). Web server logs, which are commonly used as the source of data for mining e-commerce data, were designed to debug web servers; the data they provide is insufficient, so heuristics are required to reconstruct events. Moreover, many events are never recorded in web server logs, which limits the data available for mining (the Bad). Many of the problems of dealing with web server log data can be resolved by properly architecting e-commerce sites to generate the data needed for mining. Even with a good architecture, however, some challenging problems remain hard to solve (the Ugly). Lessons and metrics based on mining real e-commerce data are presented.
Index Terms
- Mining e-commerce data: the good, the bad, and the ugly