Abstract
Multidimensional databases have been designed to provide decision makers with the necessary tools to help them understand their data. This framework is different from transactional data as the datasets contain huge volumes of historicized and aggregated data defined over a set of dimensions that can be arranged through multiple levels of granularities. Many tools have been proposed to query the data and navigate through the levels of granularity. However, automatic tools are still missing to mine this type of data in order to discover regular specific patterns. In this article, we present a method for mining sequential patterns from multidimensional databases, at the same time taking advantage of the different dimensions and levels of granularity, which is original compared to existing work. The necessary definitions and algorithms are extended from regular sequential patterns to this particular case. Experiments are reported, showing the significance of this approach.
- Agrawal, R. and Srikant, R. 1995. Mining sequential patterns. In Proceedings of the International Conference on Data Engineering (ICDE). IEEE Computer Society, 3--14. Google ScholarDigital Library
- Ayres, J., Flannick, J., Gehrke, J., and Yiu, T. 2002. Sequential pattern mining using a bitmap representation. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD). AAAI Press, 429--435. Google ScholarDigital Library
- Beyer, K. and Ramakrishnan, R. 1999. Bottom-up computation of sparse and iceberg cube. In Proceedings of the International Conference on Management of Data (SIGMOD). ACM Press, 359--370. Google ScholarDigital Library
- Bonchi, F. and Lucchese, C. 2006. On condensed representations of constrained frequent patterns. Knowl. Inform. Syst. 9, 2, 180--201.Google ScholarDigital Library
- Boulicaut, J.-F., Bykowski, A., and Rigotti, C. 2003. Free-sets: a condensed representation of Boolean data for the approximation of frequency queries. Data Mining Knowl. Disc. 7, 1, 5--22. Google ScholarDigital Library
- Burdick, D., Calimlim, M., and Gehrke, J. 2001. MAFIA: a maximal frequent itemset algorithm for transactional databases. In Proceedings of the International Conference on Data Engineering (ICDE). IEEE Computer Society, 443--452. Google ScholarDigital Library
- Calders, T. and Goethals, B. 2002. Mining all non-derivable frequent itemsets. In Proceedings of the European Conference on the Principles and Practice of Knowledge Discovery in Databases (PKDD). Lecture Notes in Computer Science, vol. 2431. Springer Verlag, 74--85. Google ScholarDigital Library
- Calders, T., Rigotti, C., and Boulicaut, J.-F. 2006. A survey on condensed representations for frequent sets. In Proceedings of Constraint-Based Mining and Inductive Databases: European Workshop on Inductive Databases and Constraint Based Mining. Lecture Notes in Computer Science, vol. 3848. Springer Verlag, 64--80. Google ScholarDigital Library
- Chiu, D.-Y., Wu, Y.-H., and Chen, A. L. P. 2004. An efficient algorithm for mining frequent sequences by a new strategy without support counting. In Proceedings of the International Conference on Data Engineering (ICDE). IEEE Computer Society, 375--386. Google ScholarDigital Library
- de Amo, S., Furtado, D. A., Giacometti, A., and Laurent, D. 2004. An a priori-based approach for first-order temporal pattern mining. In Proceedings of the Simpósio Brasileiro de Bancos de Dados. 48--62.Google Scholar
- Dietterich, T. and Michalski, R. 1985. Discovering patterns in sequences of events. Artif. Intell. 25, 2, 187--232. Google ScholarDigital Library
- Han, J. and Fu, Y. 1999. Mining multiple-level association rules in large databases. IEEE Trans. Knowl. Data Engin. 11, 5, 798--804. Google ScholarDigital Library
- Inmon, W. 2003. Building the Data Warehouse. John Wiley and Sons. Google ScholarDigital Library
- Mannila, H. and Toivonen, H. 1996. Multiple uses of frequent sets and condensed representations. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD). AAAI Press, 189--194.Google Scholar
- Mannila, H., Toivonen, H., and Verkamo, A. 1995. Discovering frequent episodes in sequences. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD). AAAI Press, 210--215.Google Scholar
- Masseglia, F., Cathala, F., and Poncelet, P. 1998. The PSP approach for mining sequential patterns. In Proceedings of the Principles and Practice of Knowledge Discovery in Databases (PKDD). Lecture Notes in Computer Science, vol. 1510. Springer Verlag, 176--184. Google ScholarDigital Library
- Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. 1999a. Discovering frequent closed itemsets for association rules. In Proceedings of the International Conference on Database Theory (ICDT). Lecture Notes in Computer Science, vol. 1540. Springer Verlag, 398--416. Google ScholarDigital Library
- Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. 1999b. Efficient mining of association rules using closed itemset lattices. Inform. Syst. 24, 1, 25--46. Google ScholarDigital Library
- Pei, J., Han, J., and Mao, R. 2000. Closet: An efficient algorithm for mining frequent closed itemsets. In Proceedings of the SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. ACM Press, 21--30.Google Scholar
- Pei, J., Han, J., Mortazavi-Asl, B., Wang, J., Pinto, H., Chen, Q., Dayal, U., and Hsu, M.-C. 2004. Mining sequential patterns by pattern-growth: The prefixspan approach. IEEE Trans. Knowl. Data Engin. 16, 11, 1424--1440. Google ScholarDigital Library
- Pinto, H., Han, J., Pei, J., Wang, K., Chen, Q., and Dayal, U. 2001. Multi-dimensional sequential pattern mining. In Proceedings of the International Conference on Information and Knowledge Management (CIKM). ACM Press, 81--88. Google ScholarDigital Library
- Plantevit, M., Choong, Y. W., Laurent, A., Laurent, D., and Teisseire, M. 2005. M2SP: Mining sequential patterns among several dimensions. In Proceedings of the European Conference on the Principles and Practice of Knowledge Discovery in Databases (PKDD). Lecture Notes in Artificial Intelligence, vol. 3721. Springer Verlag, 205--216.Google Scholar
- Plantevit, M., Laurent, A., and Teisseire, M. 2006. HYPE: mining hierarchical sequential patterns. In Proceedings of the International Workshop on Data Warehousing and OLAP (DOLAP). ACM Press, 19--26. Google ScholarDigital Library
- Rashad, S., Kantardzic, M. M., and Kumar, A. 2007. MSP-CACRR: multidimensional sequential patterns-based call admission control and resource reservation for next-generation wireless cellular networks. In Proceedings of the Symposium on Computational Intelligence and Data Mining (CIDM). IEEE Computer Society, 552--559.Google Scholar
- Srikant, R. and Agrawal, R. 1996. Mining sequential patterns: generalizations and performance improvements. In Proceedings of the International Conference on Extending Data Base Technology (EDBT). Lecture Notes in Computer Science, vol. 1057. Springer Verlag, 3--17. Google ScholarDigital Library
- Stefanowski, J. 2007. Algorithms for context based sequential pattern mining. Fundamenta Informaticae 76, 4, 495--510. Google ScholarDigital Library
- Stefanowski, J. and Ziembinski, R. 2005. Mining context based sequential patterns. In Proceedings of the Atlantic Web Intelligence Conference (AWIC). Lecture Notes in Computer Science, vol. 3528. Springer Verlag, 401--407. Google ScholarDigital Library
- Yang, Z., Kitsuregawa, M., and Wang, Y. 2006. Paid: mining sequential patterns by passed item deduction in large databases. In Proceedings of the International Database Engineering and Applications Symposium (IDEAS). IEEE Computer Society, 113--120. Google ScholarDigital Library
- Yu, C.-C. and Chen, Y.-L. 2005. Mining sequential patterns from multidimensional sequence data. IEEE Trans. Knowl. Data Engin. 17, 1, 136--140. Google ScholarDigital Library
- Zaki, M. J. 2001. SPADE: an efficient algorithm for mining frequent sequences. Machine Learn. J. Special Issue on Unsupervised Learning 42, 1/2, 31--60.Google Scholar
- Zaki, M. J. 2004. Mining non-redundant association rules. Data Mining Knowl. Discov. 9, 223--248. Google ScholarDigital Library
- Zaki, M. J. and Hsiao, C.-J. 2002. CHARM: an efficient algorithm for closed itemset mining. In Proceedings of the SIAM International Conference on Data Mining (SDM). SIAM.Google Scholar
- Zhang, C., Hu, K., Chen, Z., Chen, L., and Dong, Y. 2007. ApproxMGMSP: a scalable method of mining approximate multidimensional sequential patterns on distributed system. In Proceedings of the International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). Vol. 2. IEEE Computer Society, 730--734. Google ScholarDigital Library
Index Terms
- Mining multidimensional and multilevel sequential patterns
Recommendations
Effective database transformation and efficient support computation for mining sequential patterns
AbstractIn this paper, we propose a novel algorithm for mining frequent sequences from transaction databases. The transactions of the same customers form a set of customer sequences. A sequence (an ordered list of itemsets) is frequent if the number of ...
Discovery of frequent DATALOG patterns
Discovery of frequent patterns has been studied in a variety of data mining settings. In its simplest form, known from association rule mining, the task is to discover all frequent itemsets, i.e., all combinations of items that are found in a sufficient ...
Mining Sequential Patterns across Time Sequences
AbstractIn this paper, we deal with mining sequential patterns in multiple time sequences. Building on a state-of-the-art sequential pattern mining algorithm PrefixSpan for mining transaction databases, we propose MILE (MIning in muLtiple sEquences), an ...
Comments