DOI: 10.1145/502512.502518
Article

Mining e-commerce data: the good, the bad, and the ugly

Published: 26 August 2001

ABSTRACT

Organizations conducting Electronic Commerce (e-commerce) can greatly benefit from the insight that data mining of transactional and clickstream data provides. Such insight helps not only to improve the electronic channel (e.g., a web site), but it is also a learning vehicle for the bigger organization conducting business at brick-and-mortar stores. The e-commerce site serves as an early alert system for emerging patterns and a laboratory for experimentation. For successful data mining, several ingredients are needed and e-commerce provides all the right ones (the Good). Web server logs, which are commonly used as the source of data for mining e-commerce data, were designed to debug web servers, and the data they provide is insufficient, requiring the use of heuristics to reconstruct events. Moreover, many events are never logged in web server logs, limiting the source of data for mining (the Bad). Many of the problems of dealing with web server log data can be resolved by properly architecting the e-commerce sites to generate data needed for mining. Even with a good architecture, however, there are challenging problems that remain hard to solve (the Ugly). Lessons and metrics based on mining real e-commerce data are presented.

      • Published in

        KDD '01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
        August 2001
        493 pages
        ISBN: 158113391X
        DOI: 10.1145/502512

        Copyright © 2001 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        KDD '01 Paper Acceptance Rate: 31 of 237 submissions, 13%
        Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
