
2014 | OriginalPaper | Book chapter

9. Frequent Pattern Mining in Data Streams

Written by: Victor E. Lee, Ruoming Jin, Gagan Agrawal

Published in: Frequent Pattern Mining

Publisher: Springer International Publishing


Abstract

As the volume of digital commerce and communication has exploded, the demand for data mining of streaming data has likewise grown. One of the fundamental data mining tasks, for both static and streaming data, is frequent pattern mining. The goal of pattern mining is to identify frequently occurring patterns and structures. Such patterns may indicate scientific phenomena, economic or social trends, or even security threats. Moreover, not only is pattern discovery important by itself, but it is also a building block for machine learning tasks such as association rule induction. Traditionally, algorithms for pattern discovery have processed the entire dataset as a batch, with no restriction on how many passes through the data would be taken.
However, when the data arrive in a continuous and unending stream, the algorithm must be limited to a single pass. Moreover, the length of the stream is indeterminate, so we cannot wait for it to end. Instead, we generate an initial result after seeing a certain quantity of data, and then we periodically revise the result. A particular challenge for frequent pattern discovery is the combinatorial explosion of candidate patterns.
In this chapter, we present a structured review of online frequent pattern mining techniques. We classify the methods according to the type of pattern and data, the time window being considered, and the quality of the approximation.
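
The single-pass, periodically revised setting described above can be illustrated with a small sketch. It uses the Misra-Gries frequent-items summary, chosen here only as a well-known example of a bounded-memory, one-pass counter; it is not presented as the chapter's own method, and it handles single items rather than general patterns. The function name `misra_gries` and the parameters `k` and `report_every` are assumptions introduced for illustration.

```python
from typing import Dict, Hashable, Iterable, Iterator, Tuple


def misra_gries(
    stream: Iterable[Hashable], k: int, report_every: int
) -> Iterator[Tuple[int, Dict[Hashable, int]]]:
    """Single-pass frequent-item summary using at most k - 1 counters.

    After every `report_every` items, yields (items_seen, snapshot).
    Any item occurring more than items_seen / k times is guaranteed to
    appear in the snapshot; stored counts may underestimate the true
    counts by at most items_seen / k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for n, item in enumerate(stream, start=1):
        if item in counters:
            # Item is already tracked: count this occurrence.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement every tracked item and
            # drop those whose counter reaches zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
        if n % report_every == 0:
            # Periodically emit a revised result without ending the pass.
            yield n, dict(counters)


if __name__ == "__main__":
    events = ["a", "b", "a", "c", "a", "b", "a", "d"]
    for seen, summary in misra_gries(events, k=3, report_every=4):
        print(seen, summary)
```

Each item is read exactly once, memory is bounded by `k - 1` counters regardless of stream length, and a revised summary is available at regular intervals. Extending this idea from single items to itemsets, sequences, or graphs is where the combinatorial explosion of candidates becomes the main difficulty.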

Literatur
1.
Zurück zum Zitat R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. of Int. conf. Very Large DataBases (VLDB'94), pages 487–499, Santiago, Chile, September 1994. R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. of Int. conf. Very Large DataBases (VLDB'94), pages 487–499, Santiago, Chile, September 1994.
2.
Zurück zum Zitat Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases, VLDB '94, pages 487–499, San Francisco, CA, USA, 1994. Morgan Kaufmann Publishers Inc. Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases, VLDB '94, pages 487–499, San Francisco, CA, USA, 1994. Morgan Kaufmann Publishers Inc.
3.
Zurück zum Zitat Rakesh Agrawal and Ramakrishnan Srikant. Mining sequential patterns. In Data Engineering, 1995. Proceedings of the Eleventh International Conference on, pages 3–14. IEEE, 1995. Rakesh Agrawal and Ramakrishnan Srikant. Mining sequential patterns. In Data Engineering, 1995. Proceedings of the Eleventh International Conference on, pages 3–14. IEEE, 1995.
4.
Zurück zum Zitat R. Agrawal, H. Mannila, R. Srikant, H. Toivonent, and A. Inkeri Verkamo. Fast discovery of association rules. In U. Fayyad and et al, editors, Advances in Knowledge Discovery and Data Mining, pages 307–328. AAAI Press, Menlo Park, CA, 1996. R. Agrawal, H. Mannila, R. Srikant, H. Toivonent, and A. Inkeri Verkamo. Fast discovery of association rules. In U. Fayyad and et al, editors, Advances in Knowledge Discovery and Data Mining, pages 307–328. AAAI Press, Menlo Park, CA, 1996.
5.
Zurück zum Zitat Charu C. Aggarwal, Yao Li, Philip S. Yu, and Ruoming Jin. On dense pattern mining in graph streams. Proc. VLDB Endow., 3(1–2):975–984, September 2010.CrossRef Charu C. Aggarwal, Yao Li, Philip S. Yu, and Ruoming Jin. On dense pattern mining in graph streams. Proc. VLDB Endow., 3(1–2):975–984, September 2010.CrossRef
6.
Zurück zum Zitat Tatsuya Asai, Hiroki Arimura, Kenji Abe, Shinji Kawasoe, and Setsuo Arikawa. Online algorithms for mining semi-structured data stream. In Data Mining, 2002. ICDM 2003. Proceedings. 2002 IEEE International Conference on, pages 27–34. IEEE, 2002. Tatsuya Asai, Hiroki Arimura, Kenji Abe, Shinji Kawasoe, and Setsuo Arikawa. Online algorithms for mining semi-structured data stream. In Data Mining, 2002. ICDM 2003. Proceedings. 2002 IEEE International Conference on, pages 27–34. IEEE, 2002.
7.
Zurück zum Zitat Tatsuya Asai, Kenji Abe, Shinji Kawasoe, Hiroki Arimura, and Setsuo Arikawa. Efficient algorithms for finding frequent substructures from semi-structured data streams. In New Frontiers in Artificial Intelligence, pages 29–45. Springer, 2007. Tatsuya Asai, Kenji Abe, Shinji Kawasoe, Hiroki Arimura, and Setsuo Arikawa. Efficient algorithms for finding frequent substructures from semi-structured data streams. In New Frontiers in Artificial Intelligence, pages 29–45. Springer, 2007.
8.
Zurück zum Zitat B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and Issues in Data Stream Systems. In Proceedings of the 2002 ACM Symposium on Principles of Database Systems (PODS 2002) (Invited Paper). ACM Press, June 2002. B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and Issues in Data Stream Systems. In Proceedings of the 2002 ACM Symposium on Principles of Database Systems (PODS 2002) (Invited Paper). ACM Press, June 2002.
9.
Zurück zum Zitat Albert Bifet. Adaptive stream mining: Pattern learning and mining from evolving data streams. In Proceedings of the 2010 conference on Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams, pages 1–212, Amsterdam, The Netherlands, The Netherlands, 2010. IOS Press. Albert Bifet. Adaptive stream mining: Pattern learning and mining from evolving data streams. In Proceedings of the 2010 conference on Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams, pages 1–212, Amsterdam, The Netherlands, The Netherlands, 2010. IOS Press.
10.
Zurück zum Zitat Albert Bifet and Ricard Gavaldà. Mining adaptively frequent closed unlabeled rooted trees in data streams. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '08, pages 34–42, New York, NY, USA, 2008. ACM. Albert Bifet and Ricard Gavaldà. Mining adaptively frequent closed unlabeled rooted trees in data streams. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '08, pages 34–42, New York, NY, USA, 2008. ACM.
11.
Zurück zum Zitat Albert Bifet, Geoff Holmes, Bernhard Pfahringer, and Ricard Gavaldà. Mining frequent closed graphs on evolving data streams. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '11, pages 591–599, New York, NY, USA, 2011. ACM. Albert Bifet, Geoff Holmes, Bernhard Pfahringer, and Ricard Gavaldà. Mining frequent closed graphs on evolving data streams. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '11, pages 591–599, New York, NY, USA, 2011. ACM.
12.
Zurück zum Zitat \refauHervé Brönnimann, Bin Chen, Manoranjan Dash, Peter Haas, and Peter Scheuermann. Efficient data reduction with ease. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03, pages 59–68, New York, NY, USA, 2003. ACM. \refauHervé Brönnimann, Bin Chen, Manoranjan Dash, Peter Haas, and Peter Scheuermann. Efficient data reduction with ease. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03, pages 59–68, New York, NY, USA, 2003. ACM.
13.
Zurück zum Zitat Toon Calders, Nele Dexters, and Bart Goethals. Mining frequent itemsets in a stream. In Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on, pages 83–92. IEEE, 2007. Toon Calders, Nele Dexters, and Bart Goethals. Mining frequent itemsets in a stream. In Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on, pages 83–92. IEEE, 2007.
14.
Zurück zum Zitat Joong Hyuk Chang and Won Suk Lee. Finding recent frequent itemsets adaptively over online data streams. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, pages 487–492, New York, NY, USA, 2003. ACM. Joong Hyuk Chang and Won Suk Lee. Finding recent frequent itemsets adaptively over online data streams. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, pages 487–492, New York, NY, USA, 2003. ACM.
15.
Zurück zum Zitat Joong Hyuk Chang and Won Suk Lee. Efficient mining method for retrieving sequential patterns over online data streams. J. Inf. Sci., 31(5):420–432, October 2005.CrossRef Joong Hyuk Chang and Won Suk Lee. Efficient mining method for retrieving sequential patterns over online data streams. J. Inf. Sci., 31(5):420–432, October 2005.CrossRef
16.
Zurück zum Zitat Lei Chang, Tengjiao Wang, Dongqing Yang, and Hua Luan. Seqstream: Mining closed sequential patterns over stream sliding windows. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM '08, pages 83–92, Washington, DC, USA, 2008. IEEE Computer Society. Lei Chang, Tengjiao Wang, Dongqing Yang, and Hua Luan. Seqstream: Mining closed sequential patterns over stream sliding windows. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM '08, pages 83–92, Washington, DC, USA, 2008. IEEE Computer Society.
17.
Zurück zum Zitat Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In Automata, Languages and Programming, pages 693–703. Springer, 2002. Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In Automata, Languages and Programming, pages 693–703. Springer, 2002.
18.
Zurück zum Zitat Bin Chen, Peter Haas, and Peter Scheuermann. A new two-phase sampling based algorithm for discovering association rules. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '02, pages 462–468, New York, NY, USA, 2002. ACM. Bin Chen, Peter Haas, and Peter Scheuermann. A new two-phase sampling based algorithm for discovering association rules. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '02, pages 462–468, New York, NY, USA, 2002. ACM.
19.
Zurück zum Zitat Junbo Chen and ShanPing Li. Gc-tree:a fast online algorithm for mining frequent closed itemsets. In Emerging Technologies in Knowledge Discovery and Data Mining, pages 457–468. Springer, 2007. Junbo Chen and ShanPing Li. Gc-tree:a fast online algorithm for mining frequent closed itemsets. In Emerging Technologies in Knowledge Discovery and Data Mining, pages 457–468. Springer, 2007.
20.
Zurück zum Zitat James Cheng, Yiping Ke, and Wilfred Ng. Maintaining frequent closed itemsets over a sliding window. Journal of Intelligent Information Systems, 31(3):191–215, 2008.CrossRef James Cheng, Yiping Ke, and Wilfred Ng. Maintaining frequent closed itemsets over a sliding window. Journal of Intelligent Information Systems, 31(3):191–215, 2008.CrossRef
21.
Zurück zum Zitat William Cheung and Osmar R Zaiane. Incremental mining of frequent patterns without candidate generation or support constraint. In Database Engineering and Applications Symposium, 2003. Proceedings. Seventh International, pages 111–116. IEEE, 2003. William Cheung and Osmar R Zaiane. Incremental mining of frequent patterns without candidate generation or support constraint. In Database Engineering and Applications Symposium, 2003. Proceedings. Seventh International, pages 111–116. IEEE, 2003.
22.
Zurück zum Zitat D.W. Cheung, J. Han, V. Ng, and C.Y. Wong. Maintenance of discovered association rules in large databases: An incremental updating techniques. In Proc. 12th IEEE International Conference on Data Engineering (ICDE-96), New Orleans, Louisiana, U.S.A., March 1, 1996. D.W. Cheung, J. Han, V. Ng, and C.Y. Wong. Maintenance of discovered association rules in large databases: An incremental updating techniques. In Proc. 12th IEEE International Conference on Data Engineering (ICDE-96), New Orleans, Louisiana, U.S.A., March 1, 1996.
23.
Zurück zum Zitat Yun Chi, Haixun Wang, Philip S. Yu, and Richard R. Muntz. Moment: Maintaining closed frequent itemsets over a stream sliding window. In Proceedings of the Fourth IEEE International Conference on Data Mining, ICDM '04, pages 59–66, Washington, DC, USA, 2004. IEEE Computer Society. Yun Chi, Haixun Wang, Philip S. Yu, and Richard R. Muntz. Moment: Maintaining closed frequent itemsets over a stream sliding window. In Proceedings of the Fourth IEEE International Conference on Data Mining, ICDM '04, pages 59–66, Washington, DC, USA, 2004. IEEE Computer Society.
24.
Zurück zum Zitat Graham Cormode and S. Muthukrishnan. An improved data stream summary: The count-min sketch and its applications. J. Algorithms, 55(1):58–75, April 2005.CrossRefMATHMathSciNet Graham Cormode and S. Muthukrishnan. An improved data stream summary: The count-min sketch and its applications. J. Algorithms, 55(1):58–75, April 2005.CrossRefMATHMathSciNet
25.
Zurück zum Zitat Erik D. Demaine, Alejandro López-Ortiz, and J. Ian Munro. Frequency estimation of internet packet streams with limited space. In Proceedings of the 10th Annual European Symposium on Algorithms, ESA '02, pages 348–360, London, UK, UK, 2002. Springer-Verlag. Erik D. Demaine, Alejandro López-Ortiz, and J. Ian Munro. Frequency estimation of internet packet streams with limited space. In Proceedings of the 10th Annual European Symposium on Algorithms, ESA '02, pages 348–360, London, UK, UK, 2002. Springer-Verlag.
26.
Zurück zum Zitat C. I. Ezeife and Yi Lu. Mining web log sequential patterns with position coded pre-order linked wap-tree. Data Min. Knowl. Discov., 10(1):5–38, January 2005.CrossRefMathSciNet C. I. Ezeife and Yi Lu. Mining web log sequential patterns with position coded pre-order linked wap-tree. Data Min. Knowl. Discov., 10(1):5–38, January 2005.CrossRefMathSciNet
27.
Zurück zum Zitat CI Ezeife and Mostafa Monwar. Ssm: a frequent sequential data stream patterns miner. In Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on, pages 120–126. IEEE, 2007. CI Ezeife and Mostafa Monwar. Ssm: a frequent sequential data stream patterns miner. In Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on, pages 120–126. IEEE, 2007.
28.
Zurück zum Zitat Chris Giannella, Jiawei Han, Jian Pei, Xifeng Yan, and Philip S Yu. Mining frequent patterns in data streams at multiple time granularities. Next generation data mining, 212:191–212, 2003. Chris Giannella, Jiawei Han, Jian Pei, Xifeng Yan, and Philip S Yu. Mining frequent patterns in data streams at multiple time granularities. Next generation data mining, 212:191–212, 2003.
29.
Zurück zum Zitat Phillip B. Gibbons and Yossi Matias. New Sampling-Based Summary Statistics for Improving Approximate Query Answers. In Proc. of the 1998 ACM SIGMOD, pages 331–342. ACM Press, June 1998. Phillip B. Gibbons and Yossi Matias. New Sampling-Based Summary Statistics for Improving Approximate Query Answers. In Proc. of the 1998 ACM SIGMOD, pages 331–342. ACM Press, June 1998.
30.
Zurück zum Zitat Bart Goethals and Mohammed J. Zaki. Workshop Report on Workshop on Frequent Itemset Mining Implementations (FIMI). 2003. Bart Goethals and Mohammed J. Zaki. Workshop Report on Workshop on Frequent Itemset Mining Implementations (FIMI). 2003.
31.
Zurück zum Zitat Anamika Gupta, Vasudha Bhatnagar, and Naveen Kumar. Mining closed itemsets in data stream using formal concept analysis. In Proceedings of the 12th international conference on Data warehousing and knowledge discovery, DaWaK'10, pages 285–296, Berlin, Heidelberg, 2010. Springer-Verlag. Anamika Gupta, Vasudha Bhatnagar, and Naveen Kumar. Mining closed itemsets in data stream using formal concept analysis. In Proceedings of the 12th international conference on Data warehousing and knowledge discovery, DaWaK'10, pages 285–296, Berlin, Heidelberg, 2010. Springer-Verlag.
32.
Zurück zum Zitat J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In Proceedings of the ACM SIGMOD Conference on Management of Data, 2000. J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In Proceedings of the ACM SIGMOD Conference on Management of Data, 2000.
33.
Zurück zum Zitat Jiawei Han, Hong Cheng, Dong Xin, and Xifeng Yan. Frequent pattern mining: current status and future directions. Data Mining and Knowledge Discovery, 15(1):55–86, 2007.CrossRefMathSciNet Jiawei Han, Hong Cheng, Dong Xin, and Xifeng Yan. Frequent pattern mining: current status and future directions. Data Mining and Knowledge Discovery, 15(1):55–86, 2007.CrossRefMathSciNet
34.
Zurück zum Zitat Chandima HewaNadungodage, Yuni Xia, Jaehwan John Lee, and Yi-cheng Tu. Hyper-structure mining of frequent patterns in uncertain data streams. Knowledge and Information Systems, 37(1):219–244, 2013.CrossRef Chandima HewaNadungodage, Yuni Xia, Jaehwan John Lee, and Yi-cheng Tu. Hyper-structure mining of frequent patterns in uncertain data streams. Knowledge and Information Systems, 37(1):219–244, 2013.CrossRef
35.
Zurück zum Zitat C. Hidber. Online Association Rule Mining. In Proceedings of ACM SIGMOD Conference on Management of Data, pages 145–156. ACM Press, 1999. C. Hidber. Online Association Rule Mining. In Proceedings of ACM SIGMOD Conference on Management of Data, pages 145–156. ACM Press, 1999.
36.
Zurück zum Zitat Jochen Hipp, Ulrich Güntzer, and Gholamreza Nakhaeizadeh. Algorithms for association rule mining—a general survey and comparison. SIGKDD Explor. Newsl., 2(1):58–64, June 2000. Jochen Hipp, Ulrich Güntzer, and Gholamreza Nakhaeizadeh. Algorithms for association rule mining—a general survey and comparison. SIGKDD Explor. Newsl., 2(1):58–64, June 2000.
37.
Zurück zum Zitat Jun Huan, Wei Wang, Deepak Bandyopadhyay, Jack Snoeyink, Jan Prins, and Alexander Tropsha. Mining protein family-specific residue packing patterns from protein structure graphs. In Eighth International Conference on Research in Computational Molecular Biology (RECOMB), pages 308–315, 2004. Jun Huan, Wei Wang, Deepak Bandyopadhyay, Jack Snoeyink, Jan Prins, and Alexander Tropsha. Mining protein family-specific residue packing patterns from protein structure graphs. In Eighth International Conference on Research in Computational Molecular Biology (RECOMB), pages 308–315, 2004.
38.
Zurück zum Zitat Akihiro Inokuchi, Takashi Washio, and Hiroshi Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In Principles of Knowledge Discovery and Data Mining (PKDD2000), pages 13–23, 2000. Akihiro Inokuchi, Takashi Washio, and Hiroshi Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In Principles of Knowledge Discovery and Data Mining (PKDD2000), pages 13–23, 2000.
39.
Zurück zum Zitat Nan Jiang and Le Gruenwald. Cfi-stream: mining closed frequent itemsets in data streams. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '06, pages 592–597, New York, NY, USA, 2006. ACM. Nan Jiang and Le Gruenwald. Cfi-stream: mining closed frequent itemsets in data streams. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '06, pages 592–597, New York, NY, USA, 2006. ACM.
40.
Zurück zum Zitat Ruoming Jin and Gagan Agrawal. An algorithm for in-core frequent itemset mining on streaming data. In Proceedings of the Fifth IEEE International Conference on Data Mining, ICDM '05, pages 210–217, Washington, DC, USA, 2005. IEEE Computer Society. Ruoming Jin and Gagan Agrawal. An algorithm for in-core frequent itemset mining on streaming data. In Proceedings of the Fifth IEEE International Conference on Data Mining, ICDM '05, pages 210–217, Washington, DC, USA, 2005. IEEE Computer Society.
41.
Zurück zum Zitat Ruoming Jin, Chao Wang, Dmitrii Polshakov, Srini Parthasarathy, and Gagan Agrawal. Discovering frequent topological structures from graph datasets. In KDD, 2005. Ruoming Jin, Chao Wang, Dmitrii Polshakov, Srini Parthasarathy, and Gagan Agrawal. Discovering frequent topological structures from graph datasets. In KDD, 2005.
43.
Zurück zum Zitat Adam Koper and Hung Son Nguyen. Sequential pattern mining from stream data. In Advanced Data Mining and Applications, pages 278–291. Springer, 2011. Adam Koper and Hung Son Nguyen. Sequential pattern mining from stream data. In Advanced Data Mining and Applications, pages 278–291. Springer, 2011.
44.
Zurück zum Zitat Michihiro Kuramochi and George Karypis. Frequent subgraph discovery. In ICDM '01: Proceedings of the 2001 IEEE International Conference on Data Mining, pages 313–320, 2001. Michihiro Kuramochi and George Karypis. Frequent subgraph discovery. In ICDM '01: Proceedings of the 2001 IEEE International Conference on Data Mining, pages 313–320, 2001.
45.
Zurück zum Zitat Daesu Lee and Wonsuk Lee. Finding maximal frequent itemsets over online data streams adaptively. In Data Mining, Fifth IEEE International Conference on, pages 8–pp. IEEE, 2005. Daesu Lee and Wonsuk Lee. Finding maximal frequent itemsets over online data streams adaptively. In Data Mining, Fifth IEEE International Conference on, pages 8–pp. IEEE, 2005.
46.
Zurück zum Zitat CK-S Leung and Boyu Hao. Mining of frequent itemsets from streams of uncertain data. In Data Engineering, 2009. ICDE'09. IEEE 25th International Conference on, pages 1663–1670. IEEE, 2009. CK-S Leung and Boyu Hao. Mining of frequent itemsets from streams of uncertain data. In Data Engineering, 2009. ICDE'09. IEEE 25th International Conference on, pages 1663–1670. IEEE, 2009.
47.
Zurück zum Zitat Carson Kai-Sang Leung and Fan Jiang. Frequent itemset mining of uncertain data streams using the damped window model. In Proceedings of the 2011 ACM Symposium on Applied Computing, SAC '11, pages 950–955, New York, NY, USA, 2011. ACM. Carson Kai-Sang Leung and Fan Jiang. Frequent itemset mining of uncertain data streams using the damped window model. In Proceedings of the 2011 ACM Symposium on Applied Computing, SAC '11, pages 950–955, New York, NY, USA, 2011. ACM.
48.
Zurück zum Zitat Carson Kai-Sang Leung and Fan Jiang. Frequent pattern mining from time-fading streams of uncertain data. In Data Warehousing and Knowledge Discovery, pages 252–264. Springer, 2011. Carson Kai-Sang Leung and Fan Jiang. Frequent pattern mining from time-fading streams of uncertain data. In Data Warehousing and Knowledge Discovery, pages 252–264. Springer, 2011.
49.
Zurück zum Zitat Hua-Fu Li and Suh-Yin Lee. Mining frequent itemsets over data streams using efficient window sliding techniques. Expert Systems with Applications, 36(2):1466–1477, 2009.CrossRef Hua-Fu Li and Suh-Yin Lee. Mining frequent itemsets over data streams using efficient window sliding techniques. Expert Systems with Applications, 36(2):1466–1477, 2009.CrossRef
50.
Zurück zum Zitat Haifeng Li and Ning Zhang. A false negative maximal frequent itemset mining algorithm over stream. In Advanced Data Mining and Applications, pages 29–41. Springer, 2011. Haifeng Li and Ning Zhang. A false negative maximal frequent itemset mining algorithm over stream. In Advanced Data Mining and Applications, pages 29–41. Springer, 2011.
51.
Zurück zum Zitat Wenmin Li, Jiawei Han, and Jian Pei. Cmar: Accurate and efficient classification based on multiple class-association rules. In Proceedings of the 2001 IEEE International Conference on Data Mining, ICDM '01, pages 369–376, Washington, DC, USA, 2001. IEEE Computer Society. Wenmin Li, Jiawei Han, and Jian Pei. Cmar: Accurate and efficient classification based on multiple class-association rules. In Proceedings of the 2001 IEEE International Conference on Data Mining, ICDM '01, pages 369–376, Washington, DC, USA, 2001. IEEE Computer Society.
52.
Zurück zum Zitat Hua-Fu Li, Suh-Yin Lee, and Man-Kwan Shan. An efficient algorithm for mining frequent itemsets over the entire history of data streams. In Proc. of First International Workshop on Knowledge Discovery in Data Streams, 2004. Hua-Fu Li, Suh-Yin Lee, and Man-Kwan Shan. An efficient algorithm for mining frequent itemsets over the entire history of data streams. In Proc. of First International Workshop on Knowledge Discovery in Data Streams, 2004.
53.
Zurück zum Zitat Hua-Fu Li, Suh-Yin Lee, and Man-Kwan Shan. Online mining (recently) maximal frequent itemsets over data streams. In Research Issues in Data Engineering: Stream Data Mining and Applications, 2005. RIDE-SDMA 2005. 15th International Workshop on, pages 11–18. IEEE, 2005. Hua-Fu Li, Suh-Yin Lee, and Man-Kwan Shan. Online mining (recently) maximal frequent itemsets over data streams. In Research Issues in Data Engineering: Stream Data Mining and Applications, 2005. RIDE-SDMA 2005. 15th International Workshop on, pages 11–18. IEEE, 2005.
54.
Zurück zum Zitat Hua-Fu Li, Man-Kwan Shan, and Suh-Yin Lee. Online mining of frequent query trees over xml data streams. In Proceedings of the 15th international conference on World Wide Web, pages 959–960. ACM, 2006. Hua-Fu Li, Man-Kwan Shan, and Suh-Yin Lee. Online mining of frequent query trees over xml data streams. In Proceedings of the 15th international conference on World Wide Web, pages 959–960. ACM, 2006.
55.
Zurück zum Zitat Hua-Fu Li, Man-Kwan Shan, and Suh-Yin Lee. Dsm-fi: an efficient algorithm for mining frequent itemsets in data streams. Knowledge and Information Systems, 17(1):79–97, 2008.CrossRef Hua-Fu Li, Man-Kwan Shan, and Suh-Yin Lee. Dsm-fi: an efficient algorithm for mining frequent itemsets in data streams. Knowledge and Information Systems, 17(1):79–97, 2008.CrossRef
56.
Zurück zum Zitat Hua-Fu Li, Chin-Chuan Ho, and Suh-Yin Lee. Incremental updates of closed frequent itemsets over continuous data streams. Expert Systems with Applications, 36(2):2451–2458, 2009.CrossRef Hua-Fu Li, Chin-Chuan Ho, and Suh-Yin Lee. Incremental updates of closed frequent itemsets over continuous data streams. Expert Systems with Applications, 36(2):2451–2458, 2009.CrossRef
57.
Zurück zum Zitat Haifeng Li, Ning Zhang, and Zhixin Chen. A simple but effective maximal frequent itemset mining algorithm over streams. Journal of Software, 7(1):25–32, 2012. Haifeng Li, Ning Zhang, and Zhixin Chen. A simple but effective maximal frequent itemset mining algorithm over streams. Journal of Software, 7(1):25–32, 2012.
58.
Zurück zum Zitat Chih-Hsiang Lin, Ding-Ying Chiu, Yi-Hung Wu, and Arbee L.P. Chen. Mining frequent itemsets from data streams with a time-sensitive sliding window. In Proceedings of the Fifth SIAM International Conference on Data Mining, volume 119, page 68. SIAM, 2005. Chih-Hsiang Lin, Ding-Ying Chiu, Yi-Hung Wu, and Arbee L.P. Chen. Mining frequent itemsets from data streams with a time-sensitive sliding window. In Proceedings of the Fifth SIAM International Conference on Data Mining, volume 119, page 68. SIAM, 2005.
59.
Zurück zum Zitat Xuejun Liu, Jihong Guan, and Ping Hu. Mining frequent closed itemsets from a landmark window over online data streams. Comput. Math. Appl., 57(6):927–936, March 2009.CrossRefMATH Xuejun Liu, Jihong Guan, and Ping Hu. Mining frequent closed itemsets from a landmark window over online data streams. Comput. Math. Appl., 57(6):927–936, March 2009.CrossRefMATH
60.
Zurück zum Zitat G. S. Manku and R. Motwani. Approximate Frequency Counts Over Data Streams. In Proceedings of Conference on Very Large DataBases (VLDB), pages 346–357, September 2002. G. S. Manku and R. Motwani. Approximate Frequency Counts Over Data Streams. In Proceedings of Conference on Very Large DataBases (VLDB), pages 346–357, September 2002.
61.
Zurück zum Zitat Gurmeet Singh Manku and Rajeev Motwani. Approximate frequency counts over data streams. In Proceedings of the 28th international conference on Very Large Data Bases, VLDB '02, pages 346–357. VLDB Endowment, 2002. Gurmeet Singh Manku and Rajeev Motwani. Approximate frequency counts over data streams. In Proceedings of the 28th international conference on Very Large Data Bases, VLDB '02, pages 346–357. VLDB Endowment, 2002.
62.
Zurück zum Zitat Alice Marascu and Florent Masseglia. Mining sequential patterns from data streams: A centroid approach. J. Intell. Inf. Syst., 27(3):291–307, November 2006.CrossRef Alice Marascu and Florent Masseglia. Mining sequential patterns from data streams: A centroid approach. J. Intell. Inf. Syst., 27(3):291–307, November 2006.CrossRef
63.
Zurück zum Zitat Guojun Mao, Xindong Wu, Xingquan Zhu, Gong Chen, and Chunnian Liu. Mining maximal frequent itemsets from data streams. Journal of Information Science, 33(3):251–262, 2007.CrossRef Guojun Mao, Xindong Wu, Xingquan Zhu, Gong Chen, and Chunnian Liu. Mining maximal frequent itemsets from data streams. Journal of Information Science, 33(3):251–262, 2007.CrossRef
64.
Zurück zum Zitat Luiz F. Mendes, Bolin Ding, and Jiawei Han. Stream sequential pattern mining with precise error bounds. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM '08, pages 941–946, Washington, DC, USA, 2008. IEEE Computer Society. Luiz F. Mendes, Bolin Ding, and Jiawei Han. Stream sequential pattern mining with precise error bounds. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM '08, pages 941–946, Washington, DC, USA, 2008. IEEE Computer Society.
65.
Zurück zum Zitat Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. Efficient computation of frequent and top-k elements in data streams. In Proceedings of the 10th International Conference on Database Theory, ICDT'05, pages 398–412, Berlin, Heidelberg, 2005. Springer-Verlag. Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. Efficient computation of frequent and top-k elements in data streams. In Proceedings of the 10th International Conference on Database Theory, ICDT'05, pages 398–412, Berlin, Heidelberg, 2005. Springer-Verlag.
66.
Zurück zum Zitat Jayadev Misra and David Gries. Finding repeated elements. Technical report, Cornell University, Ithaca, NY, USA, 1982. Jayadev Misra and David Gries. Finding repeated elements. Technical report, Cornell University, Ithaca, NY, USA, 1982.
67.
Zurück zum Zitat Willie Ng and Manoranjan Dash. Efficient approximate mining of frequent patterns over transactional data streams. In Proceedings of the 10th international conference on Data Warehousing and Knowledge Discovery, DaWaK '08, pages 241–250, Berlin, Heidelberg, 2008. Springer-Verlag. Willie Ng and Manoranjan Dash. Efficient approximate mining of frequent patterns over transactional data streams. In Proceedings of the 10th international conference on Data Warehousing and Knowledge Discovery, DaWaK '08, pages 241–250, Berlin, Heidelberg, 2008. Springer-Verlag.
68.
Debprakash Patnaik, Srivatsan Laxman, Badrish Chandramouli, and Naren Ramakrishnan. A general streaming algorithm for pattern discovery. Knowledge and Information Systems, pages 1–26, 2013.
69.
Chedy Raïssi, Pascal Poncelet, and Maguelonne Teisseire. Need for SPEED: Mining sequential patterns in data streams. BDA'05: Bases de Données Avancées, Actes, 2005.
70.
Chedy Raïssi, Pascal Poncelet, and Maguelonne Teisseire. Towards a new approach for mining frequent itemsets on data stream. Journal of Intelligent Information Systems, 28(1):23–36, 2007.
71.
Syed Khairuzzaman Tanbeer, Chowdhury Farhan Ahmed, Byeong-Soo Jeong, and Young-Koo Lee. Sliding window-based frequent pattern mining over data streams. Information Sciences, 179(22):3843–3865, 2009.
72.
Wei-Guang Teng, Ming-Syan Chen, and Philip S. Yu. A regression-based temporal pattern mining scheme for data streams. In Proceedings of the 29th International Conference on Very Large Data Bases, Volume 29, VLDB '03, pages 93–104. VLDB Endowment, 2003.
73.
H. Toivonen. Sampling large databases for association rules. In Proc. of the 22nd VLDB Conference, 1996.
74.
Raymond Chi-Wing Wong and Ada Wai-Chee Fu. Mining top-k frequent itemsets from data streams. Data Mining and Knowledge Discovery, 13(2):193–217, 2006.
75.
Dong Xin, Jiawei Han, Xifeng Yan, and Hong Cheng. Mining compressed frequent-pattern sets. In VLDB, 2005.
76.
Xifeng Yan and Jiawei Han. gSpan: Graph-based substructure pattern mining. In ICDM '02: Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM'02), page 721, 2002.
77.
Jeffrey Xu Yu, Zhihong Chong, Hongjun Lu, and Aoying Zhou. False positive or false negative: mining frequent itemsets from high speed transactional data streams. In Proceedings of the Thirtieth International Conference on Very Large Data Bases, Volume 30, VLDB '04, pages 204–215. VLDB Endowment, 2004.
78.
Mohammed J. Zaki. Efficiently mining frequent trees in a forest. In KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 71–80, 2002.
79.
M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. New algorithms for fast discovery of association rules. In 3rd Intl. Conf. on Knowledge Discovery and Data Mining, August 1997.
80.
Mohammed J. Zaki and Charu C. Aggarwal. XRules: an effective structural classifier for XML data. In KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 316–325, 2003.
Metadata
Title
Frequent Pattern Mining in Data Streams
Authors
Victor E. Lee
Ruoming Jin
Gagan Agrawal
Copyright Year
2014
DOI
https://doi.org/10.1007/978-3-319-07821-2_9