Published in: International Journal of Parallel Programming 3/2015

1 June 2015

OFScheduler: A Dynamic Network Optimizer for MapReduce in Heterogeneous Cluster

Abstract

MapReduce is a popular programming paradigm in cloud computing because it scales well when processing large data sets. However, MapReduce performs poorly in heterogeneous clusters. One reason is that Hadoop's built-in load-balancing algorithm for the Map function generates excessive network traffic. We propose a new dynamic network optimizer for heterogeneous clusters, called OFScheduler, that relieves network traffic during the execution of MapReduce jobs. The optimizer focuses on reducing bandwidth competition, balancing the workload across network links, and increasing bandwidth utilization. It tags different types of traffic and uses OpenFlow to adjust flow transfers dynamically. We built a simulator and an OpenFlow testbed for evaluation. The simulation results show that the proposed optimizer significantly increases bandwidth utilization and improves MapReduce performance by 24–63% for most jobs in a multi-path heterogeneous cluster. The experimental results show that the proposed optimizer can be deployed in a real environment.
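The abstract does not give the optimizer's algorithm, so the following is only a minimal sketch of the general idea it describes: flows carry a traffic-type tag, and each flow is placed on whichever of several paths ends up least utilized. The tag names (`shuffle`, `load_balance`), the priority order, the `Flow` and `Path` types, and the greedy rule are all assumptions made for illustration. They are not taken from the paper, and no OpenFlow controller API is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    flow_id: str
    tag: str            # hypothetical traffic tag, e.g. "shuffle" or "load_balance"
    demand_mbps: float  # bandwidth the flow would like to use


@dataclass
class Path:
    name: str
    capacity_mbps: float
    load_mbps: float = 0.0
    flows: list = field(default_factory=list)


# Hypothetical priority: tags with a lower number are placed first.
TAG_PRIORITY = {"shuffle": 0, "load_balance": 1}


def assign_flows(flows, paths):
    """Greedily place each flow on the path that is least utilized after placement."""
    assignment = {}
    # Place higher-priority tags first, and larger flows first within a tag.
    ordered = sorted(
        flows, key=lambda f: (TAG_PRIORITY.get(f.tag, 99), -f.demand_mbps)
    )
    for flow in ordered:
        best = min(
            paths,
            key=lambda p: (p.load_mbps + flow.demand_mbps) / p.capacity_mbps,
        )
        best.load_mbps += flow.demand_mbps
        best.flows.append(flow.flow_id)
        assignment[flow.flow_id] = best.name
    return assignment


if __name__ == "__main__":
    # Two paths of different capacity stand in for a multi-path heterogeneous cluster.
    paths = [Path("A", 1000.0), Path("B", 500.0)]
    flows = [
        Flow("f1", "shuffle", 400.0),
        Flow("f2", "load_balance", 200.0),
        Flow("f3", "shuffle", 50.0),
    ]
    print(assign_flows(flows, paths))
    for p in paths:
        print(p.name, p.load_mbps / p.capacity_mbps)
```

In a real deployment the placement decision would be translated into flow rules on the switches, and the loads would come from measured link statistics rather than declared demands. Both steps are outside this sketch.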

Metadata
Title
OFScheduler: A Dynamic Network Optimizer for MapReduce in Heterogeneous Cluster
Publication date
1 June 2015
Published in
International Journal of Parallel Programming / Issue 3/2015
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-013-0281-6
