2016 | OriginalPaper | Chapter

Distributed Lazy Association Classification Algorithm Based on Spark

Authors: Xueming Li, Chaoyang Zhang, Guangwei Chen, Xiaoteng Sun, Qi Zhang, Haomin Yang

Published in: Advanced Data Mining and Applications

Publisher: Springer International Publishing

Abstract

Lazy association classification algorithms are inefficient when many unclassified samples must be classified at the same time. Existing lazy association classification algorithms are also sequential, so they cannot handle big data problems. To solve these problems, we propose a distributed lazy association classification algorithm based on Spark, named SDLAC. First, it clusters the unclassified samples with the K-Means algorithm. Second, it executes distributed projections according to the clustering results and mines classification association rules with a distributed mining algorithm based on Spark. Then it constructs a classifier to classify the unclassified samples. The experiments are conducted on five UCI datasets and a big dataset from the first national college competition on cloud computing (China). The results show that the SDLAC algorithm is more accurate than the CBA algorithm. In addition, it is far more efficient than the typical distributed lazy association classification algorithm. In other words, the SDLAC algorithm can adapt to big data environments.
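The per-cluster flow described in the abstract (project the training data, mine classification association rules, classify) can be illustrated with a minimal single-machine sketch. This is not the authors' SDLAC implementation: it does not use Spark, the cluster assignments are taken as given rather than computed with K-Means, rule mining is brute-force counting, and every name, threshold, and data value below (`project`, `mine_rules`, `classify`, `lazy_classify`, `min_support`, `min_confidence`, the toy weather records) is a hypothetical choice made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def project(train, items):
    """Keep only the items of each training row that also occur in `items`."""
    return [(row & items, label) for row, label in train if row & items]


def mine_rules(projected, min_support=2, min_confidence=0.6, max_len=2):
    """Mine rules (antecedent -> label) by brute-force counting (assumed thresholds)."""
    antecedent_count = Counter()
    rule_count = Counter()
    for row, label in projected:
        for k in range(1, max_len + 1):
            for antecedent in combinations(sorted(row), k):
                antecedent_count[antecedent] += 1
                rule_count[(antecedent, label)] += 1
    rules = []
    for (antecedent, label), n in rule_count.items():
        confidence = n / antecedent_count[antecedent]
        if n >= min_support and confidence >= min_confidence:
            rules.append((frozenset(antecedent), label, confidence))
    return rules


def classify(sample, rules, default):
    """Score each class by the summed confidence of the rules matching the sample."""
    scores = defaultdict(float)
    for antecedent, label, confidence in rules:
        if antecedent <= sample:
            scores[label] += confidence
    return max(scores, key=scores.get) if scores else default


def lazy_classify(train, test, clusters):
    """Project and mine once per cluster, then classify each sample in that cluster."""
    # Fall back to the majority training class when no rule matches (an assumption).
    default = Counter(label for _, label in train).most_common(1)[0][0]
    predictions = {}
    for member_ids in clusters:
        cluster_items = frozenset().union(*(test[i] for i in member_ids))
        rules = mine_rules(project(train, cluster_items))
        for i in member_ids:
            predictions[i] = classify(test[i], rules, default)
    return predictions


# Toy data: each record is a set of "attribute=value" items plus a class label.
train = [
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
    (frozenset({"outlook=rain", "windy=yes"}), "stay"),
    (frozenset({"outlook=rain", "windy=yes"}), "stay"),
    (frozenset({"outlook=sunny", "windy=yes"}), "play"),
]
test = [
    frozenset({"outlook=sunny", "windy=no"}),
    frozenset({"outlook=rain", "windy=yes"}),
]
clusters = [[0], [1]]  # assumed given; the paper obtains clusters with K-Means

predictions = lazy_classify(train, test, clusters)
print(predictions)
```

Mining once per cluster rather than once per sample is the point of the clustering step: samples that share many items reuse the same projection and rule set. In a distributed setting, each cluster's projection and mining would presumably run as a separate parallel task.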

Metadata

Title: Distributed Lazy Association Classification Algorithm Based on Spark

Authors: Xueming Li, Chaoyang Zhang, Guangwei Chen, Xiaoteng Sun, Qi Zhang, Haomin Yang

Copyright Year: 2016

DOI: https://doi.org/10.1007/978-3-319-49586-6_41
