Skip to main content
Erschienen in: Knowledge and Information Systems 1/2015

01.04.2015 | Regular Paper

An effective association rule mining scheme using a new generic basis

verfasst von: Jayakrushna Sahoo, Ashok Kumar Das, A. Goswami

Erschienen in: Knowledge and Information Systems | Ausgabe 1/2015

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Association rule mining among itemsets is a fundamental task and is of great importance in many data mining applications including attacks in network data, stock market, financial applications, bioinformatics to find genetic disorders, etc. However, association rule extraction from a reasonable-sized database produces a large number of rules. As a result, many of them are redundant to other rules, and they are practically useless. To overcome this issue, methods for mining non-redundant rules are essentially required. To address such problem, we initially propose a definition for redundancy in sense of minimal knowledge and then a compact representation of non-redundant association rules which we call as compact informative generic basis. We also provide an improved version of the existing DCI_CLOSED algorithm (DCI_PLUS) to find out the frequent closed itemsets (FCI) with their minimal representative generators in combination with BitTable which represents a compact database form in a single scan of the original database. We further introduce an algorithm for constructing the compact informative generic basis from the FCI and their generators in an efficient way. We finally present an inference mechanism in which all association rules can be generated without accessing the database. Experiments are performed on the proposed method. The experimental results show that the proposed method outperforms the other existing related methods.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data (SIGMOD ’93), pp 207–216 Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data (SIGMOD ’93), pp 207–216
2.
Zurück zum Zitat Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th international conference on very large data bases (VLDB ’94), pp 487–499 Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th international conference on very large data bases (VLDB ’94), pp 487–499
4.
Zurück zum Zitat Bastide Y, Taouil R, Pasquier N, Stumme G, Lakhal L (2000) Mining frequent patterns with counting inference. SIGKDD Explor Newsl 2(2):66–75CrossRef Bastide Y, Taouil R, Pasquier N, Stumme G, Lakhal L (2000) Mining frequent patterns with counting inference. SIGKDD Explor Newsl 2(2):66–75CrossRef
5.
Zurück zum Zitat Bonchi F, Lucchese C (2006) On condensed representations of constrained frequent patterns. Knowl Inf Syst 9(2):180–201CrossRef Bonchi F, Lucchese C (2006) On condensed representations of constrained frequent patterns. Knowl Inf Syst 9(2):180–201CrossRef
6.
Zurück zum Zitat Boulicaut JF, Bykowski A, Rigotti C (2003) Free-sets: a condensed representation of Boolean data for the approximation of frequency queries. Data Min Knowl Disc 7(1):5–22CrossRefMathSciNet Boulicaut JF, Bykowski A, Rigotti C (2003) Free-sets: a condensed representation of Boolean data for the approximation of frequency queries. Data Min Knowl Disc 7(1):5–22CrossRefMathSciNet
7.
Zurück zum Zitat Brin S, Motwani R, Ullman JD, Tsur S (1997) Dynamic itemset counting and implication rules for market basket data. In: Proceedings of the 1997 ACM SIGMOD international conference on management of data (SIGMOD ’97), pp 255–264 Brin S, Motwani R, Ullman JD, Tsur S (1997) Dynamic itemset counting and implication rules for market basket data. In: Proceedings of the 1997 ACM SIGMOD international conference on management of data (SIGMOD ’97), pp 255–264
9.
Zurück zum Zitat Casali A, Cicchetti R, Lakhal L (2005) Essential patterns: a perfect cover of frequent patterns. In: Proceedings of the 7th international conference on data warehousing and knowledge, discovery (DaWaK’05), pp 428–437 Casali A, Cicchetti R, Lakhal L (2005) Essential patterns: a perfect cover of frequent patterns. In: Proceedings of the 7th international conference on data warehousing and knowledge, discovery (DaWaK’05), pp 428–437
10.
Zurück zum Zitat Cheng J, Ke Y, Ng W (2008) Effective elimination of redundant association rules. Data Min Knowl Disc 16(2):221–249CrossRefMathSciNet Cheng J, Ke Y, Ng W (2008) Effective elimination of redundant association rules. Data Min Knowl Disc 16(2):221–249CrossRefMathSciNet
11.
Zurück zum Zitat Dong J, Han M (2007) BitTableFI: an efficient mining frequent itemsets algorithm. Knowl Based Syst 20(4):329–335CrossRef Dong J, Han M (2007) BitTableFI: an efficient mining frequent itemsets algorithm. Knowl Based Syst 20(4):329–335CrossRef
14.
15.
Zurück zum Zitat Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using FP-trees. IEEE Trans Knowl Data Eng 17(10):1347–1362CrossRef Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using FP-trees. IEEE Trans Knowl Data Eng 17(10):1347–1362CrossRef
16.
Zurück zum Zitat Hamrouni T, Yahia SB, Nguifo EM (2008) Succinct minimal generators: theoretical foundations and applications. Int J Found Comput Sci 19(2):271–296CrossRefMATH Hamrouni T, Yahia SB, Nguifo EM (2008) Succinct minimal generators: theoretical foundations and applications. Int J Found Comput Sci 19(2):271–296CrossRefMATH
17.
Zurück zum Zitat Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data (SIGMOD ’00), pp 1–12 Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data (SIGMOD ’00), pp 1–12
18.
Zurück zum Zitat Jin R, Xiang Y, Liu L (2009) Cartesian contour: a concise representation for a collection of frequent sets. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’09), pp 417–426 Jin R, Xiang Y, Liu L (2009) Cartesian contour: a concise representation for a collection of frequent sets. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’09), pp 417–426
19.
Zurück zum Zitat Kryszkiewicz M (1998) Representative association rules. In: Wu X, Kotagiri R, Korb KB (eds) Proceedings of the Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD ’98). Lecture notes in computer science 1394. Springer, Berlin, pp 198–209 Kryszkiewicz M (1998) Representative association rules. In: Wu X, Kotagiri R, Korb KB (eds) Proceedings of the Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD ’98). Lecture notes in computer science 1394. Springer, Berlin, pp 198–209
20.
Zurück zum Zitat Lin DI, Kedem ZM (2002) Pincer-search: an efficient algorithm for discovering the maximum frequent set. IEEE Trans Knowl Data Eng 14(3):553–566CrossRef Lin DI, Kedem ZM (2002) Pincer-search: an efficient algorithm for discovering the maximum frequent set. IEEE Trans Knowl Data Eng 14(3):553–566CrossRef
21.
Zurück zum Zitat Liu G, Li J, Wong L (2008) A new concise representation of frequent itemsets using generators and a positive border. Knowl Inf Syst 17(1):35–56CrossRefMathSciNet Liu G, Li J, Wong L (2008) A new concise representation of frequent itemsets using generators and a positive border. Knowl Inf Syst 17(1):35–56CrossRefMathSciNet
22.
Zurück zum Zitat Lucchese C, Orlando S, Perego R (2006) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36CrossRef Lucchese C, Orlando S, Perego R (2006) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36CrossRef
23.
Zurück zum Zitat Ng RT, Lakshmanan LVS, Han J, Pang A (1998) Exploratory mining and pruning optimizations of constrained associations rules. SIGMOD Rec 27(2):13–24CrossRef Ng RT, Lakshmanan LVS, Han J, Pang A (1998) Exploratory mining and pruning optimizations of constrained associations rules. SIGMOD Rec 27(2):13–24CrossRef
24.
Zurück zum Zitat Palshikar GK, Kale MS, Apte MM (2007) Association rules mining using heavy itemsets. Data Knowl Eng 61(1):93–113CrossRef Palshikar GK, Kale MS, Apte MM (2007) Association rules mining using heavy itemsets. Data Knowl Eng 61(1):93–113CrossRef
25.
Zurück zum Zitat Park JS, Chen M-S, Yu PS (1995) An effective hash-based algorithm for mining association rules. In: Proceedings of the 1995 ACM SIGMOD international conference on management of data (SIGMOD ’95), pp 175–186 Park JS, Chen M-S, Yu PS (1995) An effective hash-based algorithm for mining association rules. In: Proceedings of the 1995 ACM SIGMOD international conference on management of data (SIGMOD ’95), pp 175–186
26.
Zurück zum Zitat Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Efficient mining of association rules using closed itemset lattices. Inf Syst 24(1):25–46CrossRef Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Efficient mining of association rules using closed itemset lattices. Inf Syst 24(1):25–46CrossRef
27.
Zurück zum Zitat PYR Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proceedings of the 7th international conference on database theory (ICDT ’99), pp 398–416 PYR Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proceedings of the 7th international conference on database theory (ICDT ’99), pp 398–416
28.
Zurück zum Zitat Pasquier N, Taouil R, Bastide Y, Stumme G, Lakhal L (2005) Generating a condensed representation for association rules. J Intell Inf Syst 24(1):29–60CrossRefMATH Pasquier N, Taouil R, Bastide Y, Stumme G, Lakhal L (2005) Generating a condensed representation for association rules. J Intell Inf Syst 24(1):29–60CrossRefMATH
29.
Zurück zum Zitat Pei J, Han J, Mao R (2000) CLOSET: an efficient algorithm for mining frequent closed itemsets. In: ACM SIGMOD workshop on research issues in, data mining and knowledge discovery, pp 21–30 Pei J, Han J, Mao R (2000) CLOSET: an efficient algorithm for mining frequent closed itemsets. In: ACM SIGMOD workshop on research issues in, data mining and knowledge discovery, pp 21–30
30.
Zurück zum Zitat Pei J, Han J, Lu H, Nishio S, Tang S, Yang D (2001) H-mine: hyper-structure mining of frequent patterns in large databases. In: Proceedings of the 2001 IEEE international conference on data mining (ICDM ’01), pp 441–448 Pei J, Han J, Lu H, Nishio S, Tang S, Yang D (2001) H-mine: hyper-structure mining of frequent patterns in large databases. In: Proceedings of the 2001 IEEE international conference on data mining (ICDM ’01), pp 441–448
31.
Zurück zum Zitat Pei J, Dong G, Zou W, Han J (2004) Mining condensed frequent-pattern bases. Knowl Inf Syst 6(5):570–594CrossRef Pei J, Dong G, Zou W, Han J (2004) Mining condensed frequent-pattern bases. Knowl Inf Syst 6(5):570–594CrossRef
32.
Zurück zum Zitat Singh NG, Singh SR, Mahanta AK (2005) CloseMiner: discovering frequent closed itemsets using frequent closed tidsets. In: Proceedings of the fifth IEEE international conference on data mining (ICDM ’05), pp 633–636 Singh NG, Singh SR, Mahanta AK (2005) CloseMiner: discovering frequent closed itemsets using frequent closed tidsets. In: Proceedings of the fifth IEEE international conference on data mining (ICDM ’05), pp 633–636
33.
Zurück zum Zitat Song W, Yang B, Xu Z (2008) Index-CloseMiner: an improved algorithm for mining frequent closed itemset. Intell Data Anal 12(4):321–338MathSciNet Song W, Yang B, Xu Z (2008) Index-CloseMiner: an improved algorithm for mining frequent closed itemset. Intell Data Anal 12(4):321–338MathSciNet
34.
Zurück zum Zitat Srikant R, Vu Q, Agrawal R (1997) Mining association rules with item constraints. In: Proceedings of the 3rd international conference on knowledge discovery in databases and data mining, pp 67–73 Srikant R, Vu Q, Agrawal R (1997) Mining association rules with item constraints. In: Proceedings of the 3rd international conference on knowledge discovery in databases and data mining, pp 67–73
35.
Zurück zum Zitat Tsai PSM, Chen CM (2004) Mining interesting association rules from customer databases and transaction databases. Inf Syst 29(8):685–696CrossRef Tsai PSM, Chen CM (2004) Mining interesting association rules from customer databases and transaction databases. Inf Syst 29(8):685–696CrossRef
36.
Zurück zum Zitat Vo B, Hong TP, Le B (2012) DBV-Miner: a dynamic bit-vector approach for fast mining frequent closed itemsets. Expert Syst Appl 39(8):7196–7206CrossRef Vo B, Hong TP, Le B (2012) DBV-Miner: a dynamic bit-vector approach for fast mining frequent closed itemsets. Expert Syst Appl 39(8):7196–7206CrossRef
37.
Zurück zum Zitat Wang J, Han J, Pei J (2003) CLOSET+: searching for the best strategies for mining frequent closed itemsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’03), pp 236–245 Wang J, Han J, Pei J (2003) CLOSET+: searching for the best strategies for mining frequent closed itemsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’03), pp 236–245
38.
Zurück zum Zitat Xu Y, Li Y, Shaw G (2011) Reliable representations for association rules. Data Knowl Eng 70(6):555–575CrossRef Xu Y, Li Y, Shaw G (2011) Reliable representations for association rules. Data Knowl Eng 70(6):555–575CrossRef
39.
Zurück zum Zitat Yahia SB, Gasmi G, Nguifo EM (2009) A new generic basis of “factual” and “implicative” association rules. Intell Data Anal 13(4):633–656 Yahia SB, Gasmi G, Nguifo EM (2009) A new generic basis of “factual” and “implicative” association rules. Intell Data Anal 13(4):633–656
40.
Zurück zum Zitat Yen SJ, Lee YS (2002) Mining interesting association rules: a data mining language. In: Proceedings of the 6th Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD ’02), pp 172–176 Yen SJ, Lee YS (2002) Mining interesting association rules: a data mining language. In: Proceedings of the 6th Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD ’02), pp 172–176
41.
Zurück zum Zitat Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. Technical report, University of Rochester, Rochester, NY Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. Technical report, University of Rochester, Rochester, NY
42.
Zurück zum Zitat Zaki MJ (April 2002) Hsiao CJ (2002) CHARM: an efficient algorithm for closed itemset mining. In: Grossman R, Han J, Kumar V, Mannila H, Motwani R (eds) Proceedings of the second SIAM international conference on data mining. Arlington, VA, pp 457–473 Zaki MJ (April 2002) Hsiao CJ (2002) CHARM: an efficient algorithm for closed itemset mining. In: Grossman R, Han J, Kumar V, Mannila H, Motwani R (eds) Proceedings of the second SIAM international conference on data mining. Arlington, VA, pp 457–473
43.
Zurück zum Zitat Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’03), pp 326–335 Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’03), pp 326–335
45.
Zurück zum Zitat Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data Eng 17(4):462–478CrossRef Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data Eng 17(4):462–478CrossRef
Metadaten
Titel
An effective association rule mining scheme using a new generic basis
verfasst von
Jayakrushna Sahoo
Ashok Kumar Das
A. Goswami
Publikationsdatum
01.04.2015
Verlag
Springer London
Erschienen in
Knowledge and Information Systems / Ausgabe 1/2015
Print ISSN: 0219-1377
Elektronische ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-014-0732-4

Weitere Artikel der Ausgabe 1/2015

Knowledge and Information Systems 1/2015 Zur Ausgabe

Premium Partner