ABSTRACT
To improve MapReduce performance, we design DynMR. It addresses three problems that persist in existing implementations: 1) the difficulty of selecting optimal performance parameters for a single job in a fixed, dedicated environment, and the lack of any way to configure parameters that perform optimally in a dynamic, multi-job cluster; 2) long job execution times caused by a task long-tail effect, often due to ReduceTask data skew or heterogeneous computing nodes; and 3) inefficient use of hardware resources, since a ReduceTask bundles several functional phases together and may sit idle during certain phases.
DynMR adaptively interleaves the execution of several partially-completed ReduceTasks and backfills MapTasks so that they run in the same JVM, one at a time. It consists of three components. 1) A running ReduceTask uses a detection algorithm to identify resource underutilization during the shuffle phase; it then efficiently yields its allocated hardware resources to the next task. 2) ReduceTasks are gradually assembled into a progressive queue at runtime, according to a flow-control algorithm, and execute in an interleaved rotation. Additional ReduceTasks are inserted adaptively into the progressive queue while full fetching capacity has not been reached, and MapTasks are backfilled if capacity is still underused. 3) The merge threads of each ReduceTask are extracted into standalone services within the associated JVM. This design allows the data segments of multiple partially-completed ReduceTasks to reside in the same JVM heap, managed by a segment manager and served by the common merge threads. Experiments show improvements of 10% to 40%, depending on the workload.
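DynMR's actual mechanism lives inside the Hadoop/JVM runtime; as a minimal illustration of the flow-control and interleaving ideas only, the sketch below (all names, the `Task` model, and the unit `capacity` threshold are hypothetical, not DynMR's API) assembles a progressive queue of ReduceTasks until their aggregate fetch rate approaches the slot's fetch capacity, backfills a MapTask if capacity remains underused, and then rotates the queued tasks through the slot one step at a time.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    fetch_rate: float  # fraction of the slot's fetch bandwidth this task can use
    remaining: int     # rotation steps of work left

def schedule_slot(reduce_pool, map_pool, capacity=1.0):
    """Assemble a progressive queue of ReduceTasks for one JVM slot;
    if their aggregate fetch rate leaves capacity unused, backfill a MapTask."""
    queue, used = deque(), 0.0
    while reduce_pool and used < capacity:
        task = reduce_pool.pop(0)      # insert another partially-complete ReduceTask
        queue.append(task)
        used += task.fetch_rate
    backfilled = []
    if map_pool and used < capacity:   # fetch capacity still underused: backfill
        backfilled.append(map_pool.pop(0))
    return queue, backfilled, used

def run_interleaved(queue):
    """Round-robin rotation: each task runs one step in the shared JVM slot,
    then yields to the next task until all tasks finish."""
    order, q = [], deque(queue)
    while q:
        task = q.popleft()
        order.append(task.name)        # one rotation step for this task
        task.remaining -= 1
        if task.remaining > 0:
            q.append(task)             # not finished: rejoin the rotation
    return order
```

For example, two ReduceTasks with fetch rates 0.5 and 0.3 leave the slot's fetch capacity underused (0.8 < 1.0), so one MapTask is backfilled; the rotation then alternates the queued ReduceTasks one step at a time. This is a toy model of the flow-control decision, not of the detection algorithm or the shared merge service.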
DynMR: Dynamic MapReduce with ReduceTask Interleaving and MapTask Backfilling