2011 | OriginalPaper | Book chapter
A Novel Approach to Cluster Web Traversal Patterns Based on Edit Distance
Authors: Xiaoqiu Tan, Miaojun Xu
Published in: Emerging Research in Web Information Systems and Mining
Publisher: Springer Berlin Heidelberg
Edit distance is well suited as a similarity measure between user traversal patterns: traversal sequences vary in length, and edit distance can be computed between symbol strings of different lengths at low time and storage cost. Moreover, the web topology is used to compute the relationship between pages, which in turn serves as the cost of an edit operation. Finally, a two-threshold sequential clustering method (TTSCM) is used to cluster user traversal patterns. It avoids having to specify the number of clusters in advance and reduces the dependence of the clustering results on the order in which the traversal patterns are processed. Experimental results verify the effectiveness and flexibility of the proposed methods.
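
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, and the following are assumptions made for illustration only: pages are single-character symbols; `make_link_cost` is a hypothetical stand-in for the topology-based cost (directly linked pages cost 0.5 to substitute, all others 1.0); the distance from a pattern to a cluster is the minimum distance to any member; and the clustering loop follows one common textbook formulation of a two-threshold sequential scheme, which may differ from the authors' TTSCM.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Cost = Callable[[str, str], float]


def unit_cost(a: str, b: str) -> float:
    """Plain substitution cost: 0 for identical pages, 1 otherwise."""
    return 0.0 if a == b else 1.0


def make_link_cost(links: Iterable[Tuple[str, str]]) -> Cost:
    """Hypothetical topology-based cost (an assumption, not the chapter's formula):
    substituting two directly linked pages costs 0.5 instead of 1.0."""
    linked = {frozenset(pair) for pair in links}

    def cost(a: str, b: str) -> float:
        if a == b:
            return 0.0
        return 0.5 if frozenset((a, b)) in linked else 1.0

    return cost


def edit_distance(s: Sequence[str], t: Sequence[str],
                  sub_cost: Cost = unit_cost, indel_cost: float = 1.0) -> float:
    """Weighted edit distance via dynamic programming, keeping two rows only."""
    prev = [j * indel_cost for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        curr = [i * indel_cost]
        for j, b in enumerate(t, 1):
            curr.append(min(prev[j] + indel_cost,         # delete a
                            curr[j - 1] + indel_cost,     # insert b
                            prev[j - 1] + sub_cost(a, b)))  # substitute a -> b
        prev = curr
    return prev[-1]


def two_threshold_clustering(patterns: Sequence[Sequence[str]],
                             dist: Callable[[Sequence[str], Sequence[str]], float],
                             t1: float, t2: float) -> List[List[int]]:
    """Sequential clustering with two thresholds t1 < t2.

    A pattern closer than t1 to its nearest cluster joins it, one farther
    than t2 opens a new cluster, and anything in between is deferred to a
    later pass. If a pass assigns nothing, the first deferred pattern
    opens a new cluster so that the loop always makes progress.
    """
    clusters: List[List[int]] = []
    pending = list(range(len(patterns)))
    force_new = True  # the very first pattern always opens a cluster
    while pending:
        deferred: List[int] = []
        assigned_any = False
        for idx in pending:
            if force_new:
                clusters.append([idx])
                force_new, assigned_any = False, True
                continue
            # Nearest cluster, using minimum distance to any member (assumption).
            d, best = min(
                (min(dist(patterns[idx], patterns[m]) for m in c), ci)
                for ci, c in enumerate(clusters)
            )
            if d < t1:
                clusters[best].append(idx)
                assigned_any = True
            elif d > t2:
                clusters.append([idx])
                assigned_any = True
            else:
                deferred.append(idx)
        pending = deferred
        force_new = not assigned_any
    return clusters


if __name__ == "__main__":
    sessions = ["ABCD", "ABCE", "XYZ", "XYZW", "ABXY"]
    print(two_threshold_clustering(sessions, edit_distance, t1=2, t2=3))
```

In this toy run the session `ABXY` falls between the two thresholds, so it is deferred instead of being forced into a cluster on first sight. Deferring ambiguous patterns in this way is what makes the result less sensitive to presentation order. The number of clusters is never passed in; it emerges from `t1` and `t2`.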