DOI: 10.1145/1013367.1013497
Article

On mining webclick streams for path traversal patterns

Published: 19 May 2004

ABSTRACT

Mining user access patterns from a continuous stream of Web clicks presents new challenges compared with traditional Web usage mining over a large static Web-click database. Modeling user access patterns as maximal forward references, we present StreamPath, a single-pass algorithm for online discovery of frequent path traversal patterns. StreamPath works from an extended prefix-tree-based data structure that stores compressed, essential information about users' movement histories in the stream. Theoretical analysis and performance evaluation show that the space requirement of StreamPath is logarithmically bounded and that it runs faster than previous multiple-pass algorithms [2].
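To make the terminology concrete, the following is a minimal Python sketch of the two ingredients the abstract names: splitting one user's click sequence into maximal forward references (in the sense of [2]) and storing them in a counting prefix tree. It is not the StreamPath algorithm itself, whose details the abstract does not give. The names `maximal_forward_references`, `PrefixNode`, `insert_path`, and `prefix_count` are hypothetical, and the tree is a plain prefix tree with no compression or pruning.

```python
def maximal_forward_references(clicks):
    """Split one user's click sequence into maximal forward references.

    A click on a page already on the current path is treated as a backward
    move: if the user had just been moving forward, the current path is
    emitted, and the path is then cut back to the revisited page.
    """
    path = []               # current forward path from the session start
    refs = []               # emitted maximal forward references
    moving_forward = False  # True if the previous click extended the path
    for page in clicks:
        if page in path:
            if moving_forward:
                refs.append(tuple(path))
            path = path[:path.index(page) + 1]
            moving_forward = False
        else:
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        refs.append(tuple(path))  # the sequence ended on a forward move
    return refs


class PrefixNode:
    """Node of a plain counting prefix tree (assumed, simplified structure)."""

    def __init__(self):
        self.count = 0      # number of inserted paths passing through this node
        self.children = {}  # page -> PrefixNode


def insert_path(root, path):
    """Insert one maximal forward reference, incrementing counts along it."""
    node = root
    for page in path:
        node = node.children.setdefault(page, PrefixNode())
        node.count += 1


def prefix_count(root, prefix):
    """Return how many inserted paths start with `prefix` (0 if none)."""
    node = root
    for page in prefix:
        node = node.children.get(page)
        if node is None:
            return 0
    return node.count


# Usage: one session, with single letters standing for pages.
refs = maximal_forward_references("ABCDCBEGHGWAOUOV")
# refs holds the paths ABCD, ABEGH, ABEGW, AOU, AOV (as tuples)
root = PrefixNode()
for ref in refs:
    insert_path(root, ref)
# prefix_count(root, "AB") is 3; prefix_count(root, "AO") is 2
```

This sketch counts only patterns that begin at the session's first page. Frequent traversal patterns in the sense of [2] are consecutive subsequences of maximal forward references, so a fuller treatment would also have to account for paths that start mid-reference.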

References

  1. Babcock, B., Babu, S., Datar, M., Motwani, R., and Widom, J. Models and Issues in Data Stream Systems. In Proc. of the 2002 ACM Symposium on Principles of Database Systems, 2002.
  2. Chen, M.-S., Park, J.-S., and Yu, P. S. Efficient Data Mining for Path Traversal Patterns. IEEE Transactions on Knowledge and Data Engineering (TKDE), 10(2):209–221, 1998.
  3. Han, J., Pei, J., Yin, Y., and Mao, R. Mining Frequent Patterns without Candidate Generation: A Frequent-pattern Tree Approach. Data Mining and Knowledge Discovery: An International Journal, 8(1):53–87, 2004.
  4. Karp, R., Shenker, S., and Papadimitriou, C. A Simple Algorithm for Finding Frequent Elements in Streams and Bags. ACM Transactions on Database Systems (TODS), 28(1):51–55, 2003.


    • Published in

      WWW Alt. '04: Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
      May 2004
      532 pages
      ISBN: 1581139128
      DOI: 10.1145/1013367

      Copyright © 2004 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States




      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
