Published in: Knowledge and Information Systems 1/2018

03-10-2017 | Regular Paper

Exploiting highly qualified pattern with frequency and weight occupancy

Authors: Wensheng Gan, Jerry Chun-Wei Lin, Philippe Fournier-Viger, Han-Chieh Chao, Justin Zhan, Ji Zhang


Abstract

By identifying useful knowledge embedded in the behavior of search engines, users can provide valuable information for web searching and data mining. Numerous algorithms have been proposed to find interesting patterns, such as frequent patterns, in real-world applications. Most of those studies use frequency to measure the interestingness of patterns. However, objects may differ in importance in these applications, and the frequent patterns often do not contain a large portion of the desired patterns. In this paper, we present a novel method, called exploiting highly qualified patterns with frequency and weight occupancy (QFWO), which suggests possible highly qualified patterns using the ideas of co-occurrence and weight occupancy. By considering item weight, weight occupancy, and pattern frequency, we define a new type of pattern, the highly qualified pattern. A novel set-enumeration tree called the frequency-weight (FW)-tree and two compact data structures, named weight-list and FW-table, are designed. They maintain the global downward closure property and the partial downward closure property of quality and weight occupancy, which are used to further prune the search space. The proposed method discovers highly qualified patterns recursively, without candidate generation. Extensive experiments were conducted on both real-world and synthetic datasets to evaluate the effectiveness and efficiency of the proposed algorithm. The results demonstrate that the obtained patterns are reasonable and acceptable. Moreover, QFWO with its pruning strategies is efficient in terms of runtime and search space.
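
The abstract names the measures but does not define them, so the following Python sketch rests on stated assumptions rather than on the paper's definitions. It assumes that the support of a pattern is the number of transactions containing it, and that its weight occupancy is the average, over those transactions, of the pattern's summed item weights divided by the transaction's total item weight. The function names `weight_occupancy` and `mine_qualified`, the thresholds, and the toy data are hypothetical. The enumeration is brute force; it does not implement the FW-tree, weight-list, FW-table, or any of the pruning strategies.

```python
from itertools import combinations


def weight_occupancy(pattern, transactions, weights):
    """Return (support, weight occupancy) of a pattern.

    Assumed definition (not taken from the abstract): for every transaction
    that contains the pattern, take the share of the transaction's total
    item weight covered by the pattern, then average the shares.
    """
    pattern = frozenset(pattern)
    shares = []
    for transaction in transactions:
        transaction = frozenset(transaction)
        if pattern <= transaction:
            pattern_weight = sum(weights[item] for item in pattern)
            total_weight = sum(weights[item] for item in transaction)
            shares.append(pattern_weight / total_weight)
    support = len(shares)
    occupancy = sum(shares) / support if support else 0.0
    return support, occupancy


def mine_qualified(transactions, weights, min_support, min_occupancy):
    """Brute-force search for patterns meeting both thresholds.

    Every itemset is tested; a real miner would prune the search space.
    """
    items = sorted(set().union(*map(set, transactions)))
    qualified = {}
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            support, occupancy = weight_occupancy(pattern, transactions, weights)
            if support >= min_support and occupancy >= min_occupancy:
                qualified[pattern] = (support, occupancy)
    return qualified


# Hypothetical toy data: four transactions and three weighted items.
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
weights = {"a": 0.5, "b": 0.25, "c": 0.25}

print(mine_qualified(transactions, weights, min_support=2, min_occupancy=0.8))
```

In this toy data the single item `a` occurs in three transactions but covers only part of each one, so it fails the occupancy threshold, while the less frequent pairs `("a", "b")` and `("a", "c")` pass both thresholds. This mirrors the abstract's point that frequency alone does not identify the desired patterns.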

Metadata
Title: Exploiting highly qualified pattern with frequency and weight occupancy
Authors: Wensheng Gan, Jerry Chun-Wei Lin, Philippe Fournier-Viger, Han-Chieh Chao, Justin Zhan, Ji Zhang
Publication date: 03-10-2017
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 1/2018
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-017-1103-8
