2002 | OriginalPaper | Book Chapter

DELISP: Efficient Discovery of Generalized Sequential Patterns by Delimited Pattern-Growth Technology

Authors: Ming-Yen Lin, Suh-Yin Lee, Sheng-Shun Wang

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Berlin Heidelberg

An active research topic in data mining is the discovery of sequential patterns, which finds all frequent sub-sequences in a sequence database. Most studies specify no time constraints, such as maximum or minimum gaps between adjacent elements of a pattern, so the resulting patterns may be uninteresting. In addition, a data sequence is rigidly defined as containing a pattern only when each element of the pattern is contained in a distinct element of the sequence. This limitation may miss useful patterns in some applications, because the items of an element are sometimes spread across adjoining elements within a specified time period, or time window. We therefore propose a pattern-growth approach for mining generalized sequential patterns. Our approach reduces the size of sub-databases through bounded and windowed projection techniques. Bounded projections keep only sub-sequences that satisfy the time-gap constraints, and windowed projections save non-redundant sub-sequences that satisfy the sliding time window constraint. Furthermore, the delimited growth technique directly generates constraint-satisfying patterns and speeds up the growing process. Empirical evaluations show that the proposed approach has good linear scalability and outperforms the well-known GSP algorithm in the discovery of generalized sequential patterns.
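To make the three constraints concrete, the following is a minimal sketch of a containment test for a single data sequence. It is not the DELISP projection method described above, only a brute-force check. The abstract does not spell out the formal definitions, so the sketch assumes the usual GSP-style semantics: a pattern element may be matched by the union of consecutive transactions whose timestamps span at most `window`, the gap from the end of one match to the start of the next must exceed `min_gap`, and the span from the start of one match to the end of the next must not exceed `max_gap`. The function name `contains_pattern`, its parameters, and the `(timestamp, items)` representation are hypothetical choices for illustration.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical representation: a data sequence is a list of
# (timestamp, items) pairs sorted by strictly increasing timestamp.
Transaction = Tuple[int, FrozenSet[str]]


def contains_pattern(
    data: List[Transaction],
    pattern: List[FrozenSet[str]],
    min_gap: float = 0,
    max_gap: float = float("inf"),
    window: float = 0,
) -> bool:
    """Brute-force test (assumed GSP-style semantics, not DELISP itself)."""
    n = len(data)

    def search(i: int, prev_l: Optional[int], prev_u: Optional[int]) -> bool:
        if i == len(pattern):
            return True  # every pattern element has been matched
        start = 0 if prev_u is None else prev_u + 1
        for l in range(start, n):
            t_l = data[l][0]
            # min-gap: this match must start strictly more than min_gap
            # after the previous match ended.
            if prev_u is not None and t_l - data[prev_u][0] <= min_gap:
                continue
            items = set()
            for u in range(l, n):
                t_u = data[u][0]
                # sliding window: transactions l..u must fit in `window`.
                if t_u - t_l > window:
                    break
                # max-gap: this match must end within max_gap of the
                # start of the previous match.
                if prev_l is not None and t_u - data[prev_l][0] > max_gap:
                    break
                items |= data[u][1]
                if pattern[i] <= items and search(i + 1, l, u):
                    return True
        return False

    return search(0, None, None)


# Items a and b arrive in adjoining transactions, c arrives much later.
data = [(1, frozenset("a")), (2, frozenset("b")), (10, frozenset("c"))]
pattern = [frozenset("ab"), frozenset("c")]

print(contains_pattern(data, pattern))                       # False
print(contains_pattern(data, pattern, window=1))             # True
print(contains_pattern(data, pattern, window=1, max_gap=5))  # False
```

With `window=0` the element (a, b) must sit in one transaction, so the pattern is missed; a window of 1 lets the two adjoining transactions jointly match it, and a `max_gap` of 5 then rejects the match because c arrives 9 time units after the first element began.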

Metadata
Title
DELISP: Efficient Discovery of Generalized Sequential Patterns by Delimited Pattern-Growth Technology
Authors
Ming-Yen Lin
Suh-Yin Lee
Sheng-Shun Wang
Copyright Year
2002
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-47887-6_19