2008 | OriginalPaper | Buchkapitel
StreamKrimp: Detecting Change in Data Streams
verfasst von : Matthijs van Leeuwen, Arno Siebes
Erschienen in: Machine Learning and Knowledge Discovery in Databases
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
Data streams are ubiquitous. Examples range from sensor networks to financial transactions and website logs. In fact, even market basket data can be seen as a stream of sales. Detecting changes in the distribution a stream is sampled from is one of the most challenging problems in stream mining, as only limited storage can be used. In this paper we analyse this problem for streams of transaction data from an MDL perspective. Based on this analysis we introduce the
StreamKrimp
algorithm, whichuses the
Krimp
algorithm to characterise probability distributions with code tables. With these code tables,
StreamKrimp
partitions the stream into a sequence of substreams. Each switch of code table indicates a change in the underlying distribution. Experiments on both real and artificial streams show that
StreamKrimp
detects the changes while using only a very limited amount of data storage.