Abstract
We consider the problem of mining association rules over interval data (that is, ordered data for which the separation between data points has meaning). We show that the measures of rule importance (also called rule interest) used for mining nominal and ordinal data do not capture the semantics of interval data. In the presence of interval data, support and confidence are no longer intuitive measures of the interest of a rule. We propose a new definition of interest for association rules that takes into account the semantics of interval data. We develop an algorithm for mining association rules under the new definition and give an overview of our experience using the algorithm on large real-life datasets.
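To make the concern about support and confidence concrete, here is a minimal Python sketch. It is not the interest measure or the algorithm proposed in the paper. The data, the rule "age in [30, 39] implies salary in [40, 59]", and the helper names `in_range` and `rule_stats` are all hypothetical and chosen only for illustration. The sketch computes the classical support and confidence of the rule on two toy datasets, together with the spread (maximum minus minimum) of the salaries that satisfy it.

```python
from fractions import Fraction


def in_range(lo, hi):
    """Return a predicate testing lo <= v <= hi (hypothetical helper)."""
    return lambda v: lo <= v <= hi


def rule_stats(rows, age_ok, salary_ok):
    """Support, confidence and salary spread for the rule age_ok => salary_ok.

    rows is a list of (age, salary) pairs. The spread is only an
    illustrative stand-in for "how close the values are"; it is an
    assumption of this sketch, not a measure taken from the paper.
    """
    antecedent = [r for r in rows if age_ok(r[0])]
    both = [r for r in antecedent if salary_ok(r[1])]
    support = Fraction(len(both), len(rows))
    confidence = Fraction(len(both), len(antecedent)) if antecedent else Fraction(0)
    salaries = [salary for _, salary in both]
    spread = max(salaries) - min(salaries) if salaries else 0
    return support, confidence, spread


# Made-up toy data: (age in years, salary in thousands).
rows_tight = [(30, 40), (31, 41), (32, 40), (33, 42), (50, 90), (55, 95)]
rows_spread = [(30, 40), (39, 59), (32, 40), (33, 42), (50, 90), (55, 95)]

age_ok = in_range(30, 39)
salary_ok = in_range(40, 59)

for name, rows in [("tight", rows_tight), ("spread", rows_spread)]:
    support, confidence, spread = rule_stats(rows, age_ok, salary_ok)
    print(name, support, confidence, spread)
```

Both datasets give the rule the same support (2/3) and confidence (1), yet the matching salaries lie within 2 units of each other in the first dataset and 19 units in the second. Support and confidence only count which rows fall inside the ranges, so they cannot tell the two cases apart, which is one way to read the abstract's claim that they miss the semantics of interval data.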
SIGMOD '97: Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data