Cluster Computing

Published: 10 July 2017

Making a case for the on-demand multiple distributed message queue system in a Hadoop cluster

Authors: Cao Ngoc Nguyen, Soonwook Hwang, Jik-Soo Kim

Published in: Cluster Computing | Issue 3/2017

Abstract

In this paper, we present a framework that gives users a simple, convenient, and powerful way to deploy multiple message queue systems on demand in a Hadoop cluster. Specifically, we leverage Apache Kafka, one of the state-of-the-art distributed message queue systems, which can achieve high throughput, low latency, and good load balancing. Our framework automates setting up and starting Kafka brokers on the fly, so users can adopt Kafka quickly without spending much effort on installation and configuration. In addition, the framework lets users run their Kafka-based applications without detailed knowledge of the Hadoop YARN APIs and underlying mechanisms. We present a use case of the framework that evaluates Kafka’s performance across various test cases and working scenarios. The experimental results let potential Kafka users see how different settings influence queuing performance.
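To make the idea of automated broker setup concrete, the following is a minimal sketch of how per-broker settings for one on-demand Kafka cluster could be generated. It is not the framework described in the abstract: the function names, host names, ports, and log directory are hypothetical, and the sketch assumes a ZooKeeper-based Kafka deployment. Only the property keys (`broker.id`, `listeners`, `log.dirs`, `zookeeper.connect`) are standard Kafka broker settings.

```python
from typing import Dict, List


def broker_properties(broker_id: int, host: str, port: int,
                      zookeeper: str, log_dir: str) -> Dict[str, str]:
    """Build the minimal settings one ZooKeeper-based Kafka broker needs."""
    return {
        "broker.id": str(broker_id),                    # must be unique within a cluster
        "listeners": f"PLAINTEXT://{host}:{port}",      # address the broker binds to
        "log.dirs": f"{log_dir}/broker-{broker_id}",    # separate data directory per broker
        "zookeeper.connect": zookeeper,                 # may end in a chroot path
    }


def render_properties(props: Dict[str, str]) -> str:
    """Serialize settings in the key=value form used by server.properties."""
    return "\n".join(f"{k}={v}" for k, v in sorted(props.items())) + "\n"


def plan_cluster(hosts: List[str], zookeeper: str, base_port: int = 9092,
                 log_dir: str = "/tmp/kafka-logs") -> List[Dict[str, str]]:
    """Assign a broker.id to every entry in hosts (hypothetical helper).

    A host may appear more than once; each extra broker on the same host
    gets the next port so that listeners do not collide.
    """
    brokers_on_host: Dict[str, int] = {}
    plans = []
    for broker_id, host in enumerate(hosts):
        offset = brokers_on_host.get(host, 0)
        brokers_on_host[host] = offset + 1
        plans.append(broker_properties(broker_id, host, base_port + offset,
                                       zookeeper, log_dir))
    return plans


if __name__ == "__main__":
    # Hypothetical hosts and ZooKeeper address; "/kafka-a" is a chroot path.
    for plan in plan_cluster(["node1", "node1", "node2"], "zk1:2181/kafka-a"):
        print(render_properties(plan))
```

Giving each cluster its own ZooKeeper chroot path (here the hypothetical `/kafka-a`) is one way to let several independent Kafka clusters share a single ZooKeeper ensemble, which is the kind of isolation a multiple-queue deployment needs.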

Title: Making a case for the on-demand multiple distributed message queue system in a Hadoop cluster
Authors: Cao Ngoc Nguyen, Soonwook Hwang, Jik-Soo Kim
Publication date: 10 July 2017
Publisher: Springer US
Published in: Cluster Computing, Issue 3/2017
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-017-1031-0
