Published in: International Journal of Machine Learning and Cybernetics 2/2017

17.02.2015 | Original Article

TDUP: an approach to incremental mining of frequent itemsets with three-way-decision pattern updating

Authors: Yao Li, Zhi-Heng Zhang, Wen-Bin Chen, Fan Min


Abstract

Finding an efficient approach to incrementally update and maintain frequent itemsets is an important aspect of data mining. Earlier incremental algorithms focused on reducing the number of scans of the original database while it is updated. However, they still required the database to be rescanned in some situations. Here we propose a three-way decision update pattern approach (TDUP) along with a synchronization mechanism for this issue. With two support-based measures, all possible itemsets are divided into positive, boundary, and negative regions. TDUP efficiently updates frequent itemsets online, while the synchronization mechanism is periodically triggered to recompute the itemsets offline. The operation of the mechanism based on appropriate settings of two support-based measures is examined through experiments. Results from three real-world data sets show that the proposed approach is efficient and reliable.
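The three-region split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the two support-based measures, so the thresholds `UPPER` and `LOWER`, the function names, the brute-force counting, and the toy transactions below are all assumptions made for illustration. The sketch assumes that an itemset is positive when its relative support is at least `UPPER`, boundary when it lies between `LOWER` and `UPPER`, and negative otherwise, and that only positive and boundary itemsets are tracked during online updates.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Brute-force counts for all itemsets of up to max_size items (illustration only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return counts


def three_way_regions(counts, n_transactions, upper, lower):
    """Assign each counted itemset to a region by relative support (assumed rule)."""
    regions = {"positive": set(), "boundary": set(), "negative": set()}
    for itemset, count in counts.items():
        support = count / n_transactions
        if support >= upper:
            regions["positive"].add(itemset)
        elif support >= lower:
            regions["boundary"].add(itemset)
        else:
            regions["negative"].add(itemset)
    return regions


def incremental_update(tracked_counts, n_old, new_transactions):
    """Update counts of tracked itemsets using only the new transactions."""
    updated = Counter(tracked_counts)
    for transaction in new_transactions:
        items = set(transaction)
        for itemset in tracked_counts:
            if itemset <= items:  # itemset is contained in the transaction
                updated[itemset] += 1
    return updated, n_old + len(new_transactions)


# Hypothetical toy data and thresholds.
original = [["a", "b"], ["a", "b"], ["a", "b", "c"], ["a", "c"], ["b", "d"]]
increment = [["a", "c"], ["a", "c"], ["c", "d"]]
UPPER, LOWER = 0.5, 0.3

counts = count_itemsets(original)
before = three_way_regions(counts, len(original), UPPER, LOWER)

# Assumption: negative itemsets are dropped and not tracked online.
tracked = {s: c for s, c in counts.items() if s not in before["negative"]}
updated, n_total = incremental_update(tracked, len(original), increment)
after = three_way_regions(updated, n_total, UPPER, LOWER)
```

In this toy run the itemset {a, c} moves from the boundary region to the positive region without rescanning the original transactions, while {a, b} drops to the boundary. Under the stated assumptions, an itemset that starts in the negative region is never counted online, so it could become frequent unnoticed; a periodic offline recomputation, such as the synchronization mechanism the abstract mentions, would be needed to catch such cases.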

Metadata
Title
TDUP: an approach to incremental mining of frequent itemsets with three-way-decision pattern updating
Authors
Yao Li
Zhi-Heng Zhang
Wen-Bin Chen
Fan Min
Publication date
17.02.2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 2/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-015-0337-6
