
2016 | OriginalPaper | Book chapter

A Hybrid Clustering Technique to Improve Big Data Accessibility Based on Machine Learning Approaches

Authors: E. Omid Mahdi Ebadati, Mohammad Mortazavi Tabrizi

Published in: Information Systems Design and Intelligent Applications

Publisher: Springer India


Abstract

Big data refers to data that is larger or more complex than traditional data sets and is in many cases unstructured. Accessing a specific value in a huge data set that is not sorted or organized can be time-consuming and computationally expensive. As data grows, clustering becomes one of the most important unsupervised approaches for finding structure in it. In this paper, we demonstrate two approaches for clustering data with high accuracy; we then sort the data with the merge sort algorithm and, finally, use binary search to find a data value within a specific range of the data. The research presents an efficient combined method for big data that uses a genetic algorithm and k-means. After clustering with k-means, the total sum of the Euclidean distances is 3.37233e+09 for 4 clusters; after the genetic algorithm, this number is reduced to 0.0300344 in the best fit. In the second and third stages, we show that after this implementation a particular data value can be accessed much faster and more accurately than with older methods.
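The three stages named in the abstract (cluster, sort, search) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one-dimensional numeric values and uses plain k-means only, so the genetic-algorithm refinement is left out. The function names (`kmeans_1d`, `build_index`, `lookup`) and the sample data are hypothetical. The helper `total_distance` computes the kind of summed-distance figure the abstract reports, which in one dimension is the summed absolute distance to the nearest centroid.

```python
import random


def nearest(value, centroids):
    """Index of the centroid closest to value (1-D Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


def kmeans_1d(values, k, iters=20, seed=0):
    """Plain Lloyd-style k-means on 1-D data; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # assumption: random initial centroids
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [sum(g) / len(g) if g else centroids[c]
                     for c, g in enumerate(groups)]
    return centroids


def total_distance(values, centroids):
    """Sum of distances from every value to its nearest centroid."""
    return sum(abs(v - centroids[nearest(v, centroids)]) for v in values)


def merge_sort(values):
    """Return a new sorted list using top-down merge sort."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left, right = merge_sort(values[:mid]), merge_sort(values[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def binary_search(sorted_values, target):
    """Return an index of target in sorted_values, or -1 if it is absent."""
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def build_index(values, k):
    """Cluster the values, then sort each cluster so it can be binary-searched."""
    centroids = kmeans_1d(values, k)
    clusters = [[] for _ in range(k)]
    for v in values:
        clusters[nearest(v, centroids)].append(v)
    return centroids, [merge_sort(c) for c in clusters]


def lookup(target, centroids, sorted_clusters):
    """Pick the cluster by nearest centroid, then binary-search only that cluster."""
    c = nearest(target, centroids)
    return c, binary_search(sorted_clusters[c], target)


if __name__ == "__main__":
    data = [5, 1, 9, 100, 104, 98, 50, 55, 45]  # made-up sample values
    cents, index = build_index(data, 3)
    print("total distance:", total_distance(data, cents))
    print("lookup(104):", lookup(104, cents, index))
```

In this sketch a lookup first selects one cluster by nearest centroid and then searches only that cluster's sorted list, so the binary search runs over a fraction of the data. Because insertion and lookup use the same `nearest` rule, a stored value is always searched for in the cluster that holds it.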

Metadata
Title
A Hybrid Clustering Technique to Improve Big Data Accessibility Based on Machine Learning Approaches
Authors
E. Omid Mahdi Ebadati
Mohammad Mortazavi Tabrizi
Copyright year
2016
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2755-7_43
