Introduction
- Identification of the major rare pattern mining challenges through experimental analysis using real-life and synthetic datasets.
- Comparison between the areas of frequent and rare pattern mining with respect to the number of initiatives taken.
- Illustration of significant future directions for the area of rare pattern mining.
Significance of rare patterns and rare association rules
Rare pattern mining methodologies
Extensions of Apriori
Dataset | Number of transactions | Number of items | Average transaction size | Type |
---|---|---|---|---|
Mushroom | 8,124 | 119 | 23 | Dense |
Connect-4 | 67,557 | 129 | 43 | Dense |
Gazelle | 59,602 | 267 | 10 | Sparse |
Retail | 88,162 | 16,470 | 10 | Sparse |
Extensions of FP-growth
Major research challenges for rare pattern mining techniques
Mining rare patterns from databases with different data characteristics
Experimental analysis
Mining rare patterns from advanced data types
Mining rare patterns from sequential databases
Experimental analysis
ID | Sequence |
---|---|
1 | \(<\hbox {b}(\underline{\hbox {ac}}\hbox {e})(\hbox {bd})\hbox {a}(\hbox {ef})>\) |
2 | \(<(\hbox {bd})\hbox {c}(\hbox {ab})(\hbox {de})>\) |
3 | \(<(\hbox {cf})(\underline{\hbox {ac}})(\hbox {ef})\underline{\hbox {d}}\hbox {b}>\) |
4 | \(<\hbox {fg}(\hbox {bf})\hbox {ada}>\) |
5 | \(<\hbox {e}(\hbox {dc})\hbox {fb}(\hbox {a})>\) |
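As a rough illustration of how support is counted in a sequence database, the sketch below encodes the five sequences from the table above and counts how many of them contain a given pattern. The helper names (`seq`, `contains`, `support`, `is_rare`) and the threshold `min_sup=3` are assumptions made for this example, not definitions taken from the article; the pattern `<(ac)d>` is chosen only because it matches the underlined items.

```python
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]
Sequence_ = List[Itemset]


def seq(*elements: str) -> Sequence_:
    """Build a sequence from strings, one string per itemset: 'ac' -> {a, c}."""
    return [frozenset(e) for e in elements]


# The five sequences from the table, keyed by sequence ID.
DATABASE: Dict[int, Sequence_] = {
    1: seq("b", "ace", "bd", "a", "ef"),
    2: seq("bd", "c", "ab", "de"),
    3: seq("cf", "ac", "ef", "d", "b"),
    4: seq("f", "g", "bf", "a", "d", "a"),
    5: seq("e", "dc", "f", "b", "a"),
}


def contains(sequence: Sequence_, pattern: Sequence_) -> bool:
    """True if each pattern itemset is a subset of a distinct, later itemset."""
    pos = 0
    for element in pattern:
        # Greedily take the earliest itemset that covers this pattern element.
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def supporting_ids(database: Dict[int, Sequence_], pattern: Sequence_) -> List[int]:
    """IDs of the sequences that contain the pattern."""
    return [sid for sid, s in database.items() if contains(s, pattern)]


def support(database: Dict[int, Sequence_], pattern: Sequence_) -> int:
    """Absolute support: number of sequences containing the pattern."""
    return len(supporting_ids(database, pattern))


def is_rare(database: Dict[int, Sequence_], pattern: Sequence_, min_sup: int) -> bool:
    """Hypothetical rarity test: the pattern occurs, but in fewer than min_sup sequences."""
    return 0 < support(database, pattern) < min_sup


pattern = seq("ac", "d")                     # the pattern <(ac)d>
print(supporting_ids(DATABASE, pattern))     # [1, 3]
print(is_rare(DATABASE, pattern, min_sup=3)) # True under the assumed threshold
```

Under the assumed threshold, `<(ac)d>` is rare because only sequences 1 and 3 contain it, while the single item `<a>` occurs in all five sequences.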
Mining rare patterns from time-series or spatiotemporal databases
Experimental analysis
Mining rare patterns from graph databases
Experimental analysis
Mining rare patterns from large and high dimensional databases
Experimental analysis
Mining rare patterns from incremental data
Experimental analysis
Mining rare patterns from data streams
Experimental analysis
Tid | Items | | Tid | Items |
---|---|---|---|---|
(a) Original database \(\hbox {D}\) | | | (b) Updated database \(\hbox {D}^{\prime }\) | |
1 | a, b, c | \(\hbox {D}-\) | 1 | a, b, c |
2 | a, b, e | | 2 | a, b, e |
3 | a, b, d, f | | 3 | a, b, d, f |
4 | a, b | | 4 | a, b |
5 | c, d, e | | 5 | c, d, e |
6 | a, b, e, f | | 6 | a, b, e, f |
 | | \(\hbox {D}+\) | 7 | a, b, d |
 | | | 8 | a, b, c, d |
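To make the effect of an update concrete, the following sketch recomputes single-item supports before and after the two added transactions (Tid 7 and 8). It is only an illustration under stated assumptions: the relative threshold `min_sup = 0.35` is hypothetical, the function names are invented for this example, and the updated database is taken to be exactly the eight transactions listed under (b), since the table alone does not show which transactions the D− label covers.

```python
from collections import Counter
from typing import Dict, List, Set

# Original database D (Tid 1-6) and the increment D+ (Tid 7-8) from the table.
D: List[Set[str]] = [
    set("abc"), set("abe"), set("abdf"), set("ab"), set("cde"), set("abef"),
]
D_PLUS: List[Set[str]] = [set("abd"), set("abcd")]

# Assumption: the updated database is the eight transactions shown under (b).
D_PRIME: List[Set[str]] = D + D_PLUS


def item_supports(db: List[Set[str]]) -> Dict[str, float]:
    """Relative support of every item: fraction of transactions containing it."""
    counts = Counter(item for transaction in db for item in transaction)
    return {item: counts[item] / len(db) for item in sorted(counts)}


def rare_items(db: List[Set[str]], min_sup: float) -> Set[str]:
    """Items whose relative support falls below the (hypothetical) threshold."""
    return {item for item, s in item_supports(db).items() if s < min_sup}


MIN_SUP = 0.35  # hypothetical threshold, chosen only for illustration

print(sorted(rare_items(D, MIN_SUP)))        # ['c', 'd', 'f']
print(sorted(rare_items(D_PRIME, MIN_SUP)))  # ['f']
```

With this threshold, items c and d are rare in the original database but not in the updated one, which is why the set of rare patterns has to be maintained, rather than assumed stable, as transactions arrive.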
Frequent vs rare pattern mining: a comparison
Issues handled by frequent and rare pattern mining techniques
Issues | Articles handling the issues | No. of articles |
---|---|---|
Mining data with different data characteristics | | 7 |
Mining sequential patterns | | 38 |
Mining time-series and spatiotemporal databases | | 19 |
Mining graph databases | | 17 |
Mining large and high dimensional databases | | 11 |
Handling incremental data | | 19 |
Handling data streams | | 27 |
Comparison between frequent and rare pattern mining techniques
Future directions for rare pattern mining
Issues | Articles handling the issues | No. of articles |
---|---|---|
Mining data with different data characteristics | Nil | – |
Mining sequential patterns | | 2 |
Mining time-series and spatiotemporal databases | Nil | – |
Mining graph databases | [170] | 1 |
Mining large and high dimensional databases | Nil | – |
Handling incremental data | Nil | – |
Handling data streams | | 5 |