
2018 | OriginalPaper | Book chapter

A Novel Approach of Frequent Itemset Mining Using HDFS Framework

Written by: Prajakta G. Kulkarni, S. R. Khonde

Published in: Intelligent Computing and Information and Communication

Publisher: Springer Singapore

Abstract

Frequent itemset extraction is an important task in data mining. It is used in applications such as association rule mining and correlation analysis, which typically extract frequent itemsets with algorithms such as Apriori and FP-Growth. These algorithms do not efficiently support load balancing, load distribution, or automatic parallelization at good speed. With very large datasets, data partitioning and fault tolerance are also not possible. Hence, there is a need for algorithms that remove these issues. Here, a novel approach is used to extract frequent itemsets using MapReduce. The system is based on a modified Apriori algorithm, called Frequent Itemset Mining using Modified Apriori (FIMMA). To automate data parallelization, balance the load well, and reduce execution time, FIMMA works concurrently and independently using three mappers. It uses a decomposition strategy to work concurrently.
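The abstract gives no algorithmic detail for FIMMA, so the following is only a minimal sketch of plain Apriori arranged as map and reduce phases, not the authors' method. It runs in a single process to show the data flow; the function names (`map_count`, `reduce_counts`, `next_candidates`, `frequent_itemsets`), the round-robin partitioning, and the sample data are all assumptions made for illustration. The default of three partitions only mirrors the "three mappers" mentioned above.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, candidates):
    """Mapper (hypothetical): count each candidate itemset in one partition."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:  # candidate is a subset of the transaction
                counts[cand] += 1
    return counts


def reduce_counts(partials, min_support):
    """Reducer (hypothetical): sum partition counts, keep frequent itemsets."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({item for itemset in frequent for item in itemset})
    out = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            out.add(frozenset(combo))
    return out


def frequent_itemsets(transactions, min_support, n_mappers=3):
    """Level-wise Apriori; each level is one simulated map/reduce round."""
    # Assumption: simple round-robin split, one partition per mapper.
    partitions = [transactions[i::n_mappers] for i in range(n_mappers)]
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        partials = [map_count(p, candidates) for p in partitions]
        frequent = reduce_counts(partials, min_support)
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result


# Illustrative data only; not taken from the chapter.
sample = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
found = frequent_itemsets(sample, min_support=3)
```

In a real Hadoop job each call to `map_count` would run as a separate map task over an HDFS block, and the framework would handle partitioning, shuffling, and fault tolerance; the sketch leaves all of that out.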

Metadata
Title
A Novel Approach of Frequent Itemset Mining Using HDFS Framework
Written by
Prajakta G. Kulkarni
S. R. Khonde
Copyright year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7245-1_43
