ABSTRACT
The distributed data processing framework MapReduce is increasingly deployed in Clouds to leverage the pay-per-usage cloud computing model. The popular Hadoop MapReduce environment expects end users to determine the type and amount of Cloud resources to reserve as well as the configuration of Hadoop parameters. However, such resource reservation and job provisioning decisions require in-depth knowledge of system internals and laborious but often ineffective parameter tuning. We propose and develop AROMA, a system that automates the allocation of heterogeneous Cloud resources and the configuration of Hadoop parameters to achieve quality of service goals while minimizing the incurred cost. It addresses the significant challenge of provisioning ad-hoc jobs that have performance deadlines in Clouds through a novel two-phase machine learning and optimization framework. Its technical core is a support vector machine based performance model that enables the integration of various aspects of resource provisioning and auto-configuration of Hadoop jobs. It adapts to ad-hoc jobs by robustly matching their resource utilization signature with those of previously executed jobs and making provisioning decisions accordingly. We implement AROMA as an automated job provisioning system for Hadoop MapReduce hosted on virtualized HP ProLiant blade servers. Experimental results show AROMA's effectiveness in providing performance guarantees for diverse Hadoop benchmark jobs while minimizing the cost of Cloud resource usage.
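The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not AROMA's implementation: a nearest-neighbour match stands in for the signature matching step, and a hand-written closed-form function stands in for the support vector machine based performance model. Every name, signature value, price, and constant below is a hypothetical assumption chosen for illustration only.

```python
import math
from itertools import product

# Hypothetical resource utilization signatures of previously executed jobs:
# mean (cpu, network, disk) utilization, each in [0, 1].
PAST_JOBS = {
    "sort-like": (0.35, 0.70, 0.80),
    "wordcount-like": (0.85, 0.20, 0.30),
}

# Hypothetical seconds of work per GB of input on one baseline-speed VM.
WORK_PER_GB = {"sort-like": 120.0, "wordcount-like": 60.0}

# Hypothetical VM types: (relative speed, price in dollars per VM-hour).
VM_TYPES = {"small": (1.0, 0.10), "medium": (2.0, 0.22)}

STARTUP_OVERHEAD_S = 30.0  # assumed fixed job startup cost in seconds


def match_signature(signature, past_jobs):
    """Phase 1 stand-in: return the past job with the nearest signature."""
    return min(past_jobs, key=lambda name: math.dist(signature, past_jobs[name]))


def predict_seconds(job_class, vm_type, n_vms, input_gb):
    """Stand-in performance model (a trained regressor in a real system)."""
    speed, _price = VM_TYPES[vm_type]
    return WORK_PER_GB[job_class] * input_gb / (speed * n_vms) + STARTUP_OVERHEAD_S


def cheapest_config(job_class, input_gb, deadline_s, max_vms=8):
    """Phase 2 stand-in: exhaustive search for the cheapest allocation
    whose predicted completion time meets the deadline.

    Returns (cost, vm_type, n_vms, predicted_seconds), or None if no
    candidate allocation meets the deadline.
    """
    best = None
    for vm_type, n_vms in product(VM_TYPES, range(1, max_vms + 1)):
        seconds = predict_seconds(job_class, vm_type, n_vms, input_gb)
        if seconds > deadline_s:
            continue  # violates the performance deadline
        cost = n_vms * VM_TYPES[vm_type][1] * seconds / 3600.0
        if best is None or cost < best[0]:
            best = (cost, vm_type, n_vms, seconds)
    return best


if __name__ == "__main__":
    # An ad-hoc job is profiled briefly, matched, then provisioned.
    job_class = match_signature((0.40, 0.65, 0.75), PAST_JOBS)
    print(job_class, cheapest_config(job_class, input_gb=10, deadline_s=300))
```

The exhaustive search is only practical because the hypothetical configuration space here is tiny; a real system searching over many Hadoop parameters would need a proper optimization method.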
Index Terms
- AROMA: automated resource allocation and configuration of mapreduce environment in the cloud