research-article

AROMA: Automated Resource Allocation and Configuration of MapReduce Environment in the Cloud

Authors:
Palden Lama, University of Colorado at Colorado Springs, Colorado Springs, CO, USA
Xiaobo Zhou, University of Colorado at Colorado Springs, Colorado Springs, CO, USA

ICAC '12: Proceedings of the 9th international conference on Autonomic computing, September 2012, Pages 63–72, https://doi.org/10.1145/2371536.2371547

Published: 18 September 2012

ABSTRACT

The distributed data processing framework MapReduce is increasingly deployed in Clouds to leverage the pay-per-usage cloud computing model. The popular Hadoop MapReduce environment expects end users to determine the type and amount of Cloud resources to reserve as well as the configuration of Hadoop parameters. However, such resource reservation and job provisioning decisions require in-depth knowledge of system internals and laborious but often ineffective parameter tuning. We propose and develop AROMA, a system that automates the allocation of heterogeneous Cloud resources and the configuration of Hadoop parameters to achieve quality of service goals while minimizing the incurred cost. It addresses the significant challenge of provisioning ad-hoc jobs that have performance deadlines in Clouds through a novel two-phase machine learning and optimization framework. Its technical core is a support vector machine based performance model that enables the integration of various aspects of resource provisioning and auto-configuration of Hadoop jobs. It adapts to ad-hoc jobs by robustly matching their resource utilization signatures with those of previously executed jobs and making provisioning decisions accordingly. We implement AROMA as an automated job provisioning system for Hadoop MapReduce hosted on virtualized HP ProLiant blade servers. Experimental results show AROMA's effectiveness in providing performance guarantees for diverse Hadoop benchmark jobs while minimizing the cost of Cloud resource usage.
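
As a rough illustration of the two ideas the abstract names, matching a new job's resource utilization signature against previously executed jobs and then choosing the cheapest allocation that meets a deadline, the following Python sketch may help. It is not the authors' implementation: the job names, signatures, candidate allocations, prices, and the simple analytical predictor (a stand-in for the support vector machine model described above) are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical utilization signatures of previously executed jobs:
# (mean CPU, mean disk I/O, mean network) utilization, each in [0, 1].
PAST_JOBS = {
    "sort": (0.35, 0.80, 0.60),
    "wordcount": (0.85, 0.30, 0.20),
    "grep": (0.50, 0.55, 0.10),
}

# Hypothetical work per job class, in VM-seconds per GB of input.
WORK_VM_SECONDS_PER_GB = {"sort": 400.0, "wordcount": 240.0, "grep": 120.0}

# Hypothetical candidate allocations: (name, number of VMs, price per VM-hour).
CANDIDATES = [("small", 4, 0.10), ("medium", 8, 0.10), ("large", 16, 0.10)]


def match_signature(signature):
    """Phase 1: return the past job whose signature is nearest (Euclidean)."""
    return min(PAST_JOBS, key=lambda name: math.dist(signature, PAST_JOBS[name]))


def predict_seconds(job_class, input_gb, n_vms):
    """Stand-in performance model (NOT an SVM): parallel work plus fixed startup."""
    return 30.0 + WORK_VM_SECONDS_PER_GB[job_class] * input_gb / n_vms


def cheapest_allocation(signature, input_gb, deadline_seconds):
    """Phase 2: pick the lowest-cost candidate predicted to meet the deadline."""
    job_class = match_signature(signature)
    feasible = []
    for name, n_vms, price_per_vm_hour in CANDIDATES:
        seconds = predict_seconds(job_class, input_gb, n_vms)
        if seconds <= deadline_seconds:
            cost = n_vms * price_per_vm_hour * seconds / 3600.0
            feasible.append((cost, name))
    # None signals that no candidate is predicted to meet the deadline.
    return job_class, (min(feasible)[1] if feasible else None)


if __name__ == "__main__":
    print(cheapest_allocation((0.40, 0.75, 0.55), input_gb=10, deadline_seconds=600))
```

In this toy setting the smallest allocation misses the deadline and the largest one costs more than necessary, so the search settles on the middle option. A real system would replace `predict_seconds` with a model trained on measured job runs.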

Published in
ICAC '12: Proceedings of the 9th international conference on Autonomic computing
September 2012
222 pages
ISBN: 9781450315203
DOI: 10.1145/2371536
General Chair: Dejan Milojicic (HP Labs)
Program Chairs: Dongyan Xu (Purdue University), Vanish Talwar (HP Labs)
Copyright © 2012 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
auto-configuration
mapreduce
resource allocation