ABSTRACT
Data mining and analytics aim to analyze valuable data and extract implicit, previously unknown, and potentially useful information from them. Due to advances in technology, high volumes of valuable data are generated at a high velocity from a wide variety of data sources in real-life business, scientific, and engineering applications. Given these high volumes, the quality and accuracy of the data depend on their veracity (i.e., the uncertainty of the data). This leads us into the new era of Big Data. This paper presents some work on big data mining and computing, especially on the important task of frequent pattern mining, which mines big data for interesting knowledge in the form of frequently occurring sets of merchandise items in shopping markets, interesting co-located events, and/or popular individuals in social networks. The paper also shows how big data mining contributes to real-life applications and services.
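To make the notion of frequent pattern mining concrete, the following is a minimal single-machine sketch of level-wise (Apriori-style) frequent itemset mining over shopping baskets. It is an illustration only, not the method presented in the paper: the function name `frequent_itemsets`, the `min_support` parameter, and the sample baskets are all assumptions introduced here, and the sketch ignores both data uncertainty and distributed execution.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Hypothetical helper for illustration; uses absolute support counts.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: scan the transactions once for the surviving candidates.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


# Assumed sample data: four shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
patterns = frequent_itemsets(baskets, min_support=2)
```

With these assumed baskets and a minimum support of 2, each single item and each pair of items qualifies as frequent, while the three-item set does not, because it appears in only one basket.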
- Big Data Mining Applications and Services