Published in: Cluster Computing 5/2019

5 February 2018

Map Reduce for big data processing based on traffic aware partition and aggregation

Authors: G. Venkatesh, K. Arunesh

Abstract

Big data refers to data sets whose volume reaches 500+ terabytes per day. Its velocity makes it difficult to capture, manage, process and analyze 2 million records per day. Another characteristic of big data is variability, which makes it difficult to identify the reason for losses in, e.g., images, audio, video, sensor data and log files. Hadoop can be used to analyze this huge amount of data: with Hadoop, an approximate early result from partial execution of a job becomes available to the user before the job completes, which reduces the response time. A Layer 3 traffic-aware clustering programming model is used for processing big data; it comprises the map data-processing function followed by sort and reduce techniques. The Layer 3 traffic-aware clustering method is implemented on top of Hadoop, with the input partitioned into fixed-size HDFS blocks, and generates intermediate output as a collection of <num, data> pairs. The conventional hash-function method is used for partitioning intermediate data among reduce tasks, but it is not traffic-efficient. In this paper, to reduce network traffic cost, a MapReduce job is designed with a data partition and an aggregator that can reduce the merged traffic from multiple map tasks. The proposed algorithm is more efficient at reducing response time, and the simulation results show that the proposal can reduce network traffic.
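The two mechanisms the abstract contrasts, hash partitioning of intermediate pairs and aggregation of map output before it is shuffled, can be illustrated with a small in-memory sketch. This is not the authors' algorithm and it does not model their traffic-aware partitioning; it is a toy word count in plain Python (no Hadoop), in which every function name is hypothetical and the number of pairs sent to reducers is assumed to be a rough proxy for network traffic.

```python
from collections import Counter, defaultdict
import zlib


def hash_partition(key: str, num_reducers: int) -> int:
    """Conventional partitioner: stable hash of the key modulo the reducer count."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def map_task(records):
    """Toy map function: emit one (word, 1) pair per word in the input split."""
    return [(word, 1) for line in records for word in line.split()]


def aggregate(pairs):
    """Combiner-style aggregation: sum values per key before they leave the node."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def shuffle(map_outputs, num_reducers):
    """Route each pair to its reducer; return reducer inputs and the pair count sent."""
    reducers = defaultdict(list)
    sent = 0
    for pairs in map_outputs:
        for key, value in pairs:
            reducers[hash_partition(key, num_reducers)].append((key, value))
            sent += 1
    return reducers, sent


def reduce_all(reducers):
    """Sum the values per key across all reducer inputs."""
    totals = Counter()
    for pairs in reducers.values():
        for key, value in pairs:
            totals[key] += value
    return dict(totals)


# Two input splits, standing in for two fixed-size blocks (assumed toy data).
splits = [["a b a", "a c"], ["b b a", "c a"]]
raw = [map_task(split) for split in splits]

plain_reducers, plain_sent = shuffle(raw, num_reducers=2)
agg_reducers, agg_sent = shuffle([aggregate(p) for p in raw], num_reducers=2)

print(plain_sent, agg_sent)
print(reduce_all(agg_reducers))
```

On this toy input the number of shuffled pairs falls from 10 to 6 while the reduced totals stay the same. A traffic-aware partitioner in the sense of the abstract would additionally choose the reducer for each key according to network cost rather than by hash alone, which this sketch leaves out.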

Metadata
Title
Map Reduce for big data processing based on traffic aware partition and aggregation
Authors
G. Venkatesh
K. Arunesh
Publication date
5 February 2018
Publisher
Springer US
Published in
Cluster Computing / Special Issue 5/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-1799-6
