A comparative analysis of iterative MapReduce systems

Authors:
Minseo Kang
Graduate School of Knowledge Service Engineering, KAIST, Korea

Jae-Gil Lee
Graduate School of Knowledge Service Engineering, KAIST, Korea

EDB '16: Proceedings of the Sixth International Conference on Emerging Databases: Technologies, Applications, and Theory
October 2016, Pages 61–64
https://doi.org/10.1145/3007818.3007819

Published: 17 October 2016

ABSTRACT

Since the development of MapReduce, there have been several efforts to adapt data mining and machine learning algorithms to MapReduce. Many of those algorithms are iterative by nature. To process them efficiently, Spark and research prototypes such as HaLoop, iMapReduce, and Twister have been proposed, each offering its own approach to iterative computation. In this paper, we thoroughly examine the pros and cons of each system.

References

Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst. HaLoop: Efficient iterative data processing on large clusters. Proceedings of the VLDB Endowment, 3(1–2):285–296, 2010.
Y. Zhang, Q. Gao, L. Gao, and C. Wang. iMapReduce: A distributed computing framework for iterative computation. Journal of Grid Computing, 10(1):47–68, 2012.
J. Ekanayake, H. Li, B. Zhang, T. Gunarathne, S. H. Bae, J. Qiu, and G. Fox. Twister: A runtime for iterative MapReduce. In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pages 810–818, 2010.
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster computing with working sets. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, pages 10, 2010.
T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmeleegy, and R. Sears. MapReduce Online. In Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, pages 21, 2010.
E. Elnikety, T. Elsayed, and H. E. Ramadan. iHadoop: Asynchronous iterations for MapReduce. In Proceedings of the 2011 IEEE 3rd International Conference on Cloud Computing Technology and Science, pages 81–90, 2011.
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, pages 135–146, 2010.

Published in
EDB '16: Proceedings of the Sixth International Conference on Emerging Databases: Technologies, Applications, and Theory
October 2016
183 pages
ISBN: 9781450347549
DOI: 10.1145/3007818
General Chairs: Jinho Kim, Young-Kuk Kim, James Geller
Program Chairs: Wonik Choi, Carson K. Leung, Young-Ho Park
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
MapReduce
hadoop
iterative algorithms
spark
Qualifiers
- research-article