
Published: 15 December 2018

Fast Recovery MapReduce (FAR-MR) to accelerate failure recovery in big data applications

Authors: Yongqing Zhu, Juniarto Samsudin, Renuga Kanagavelu, Weiwen Zhang, Long Wang, Theint Theint Aye, Rick Siow Mong Goh

Published in: The Journal of Supercomputing | Issue 5/2020


Abstract

The existing Hadoop MapReduce fault tolerance strategy imposes a high performance penalty on computing jobs during failure recovery. In this paper, we propose Fast Recovery MapReduce (FAR-MR) to improve MapReduce performance during failure recovery. FAR-MR includes a novel fault tolerance strategy that combines distributed checkpointing with a proactive push mechanism to support fast recovery from task failure and node failure. With distributed checkpointing, the computing progress of each task is periodically recorded as checkpoints and kept in distributed data storage. During failure recovery, the recovered task can obtain the last recorded progress of the failed task from the distributed storage. In addition, the proactive push mechanism transmits the computing results of map tasks ahead of time to the nodes hosting the reduce tasks of the same computing job. When a failure happens, the partial output already pushed to the reducer nodes can be used by the reduce tasks without recomputation. FAR-MR allows a failed task to be recovered efficiently at any node in the cluster. The performance evaluation shows that FAR-MR improves computing job performance by up to 62% and 45% compared with Hadoop MapReduce in the case of task failure recovery and node failure recovery, respectively.
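The two mechanisms named in the abstract can be illustrated with a small, single-process sketch. This is not the authors' implementation and does not use any Hadoop API: the `CheckpointStore` class, the `run_map_task` function, the `reducer_inbox` dictionary, and the `interval` and `fail_at` parameters are all hypothetical names chosen for illustration, and a plain in-memory dictionary stands in for the distributed storage. The sketch only shows the general idea of periodically saving task progress, pushing partial map output toward the reducer side, and resuming a failed task from its last checkpoint instead of from the beginning.

```python
class CheckpointStore:
    """Stand-in for distributed storage (assumption: a plain dict)."""

    def __init__(self):
        self._data = {}

    def save(self, task_id, next_index, partial):
        # Store a copy so later mutation by the task cannot alter the checkpoint.
        self._data[task_id] = (next_index, dict(partial))

    def load(self, task_id):
        next_index, partial = self._data.get(task_id, (0, {}))
        return next_index, dict(partial)


class TaskFailure(Exception):
    """Simulated task failure."""


def run_map_task(task_id, records, store, reducer_inbox, interval=2, fail_at=None):
    """Word-count map task that checkpoints every `interval` records.

    At each checkpoint the partial counts are also copied into
    `reducer_inbox`, imitating a proactive push to the reducer side.
    Returns the final counts and the number of records processed in this run.
    """
    start, counts = store.load(task_id)  # resume from the last checkpoint, if any
    processed = 0
    for i in range(start, len(records)):
        if fail_at is not None and i == fail_at:
            raise TaskFailure(f"{task_id} failed at record {i}")
        for word in records[i].split():
            counts[word] = counts.get(word, 0) + 1
        processed += 1
        if (i + 1) % interval == 0 or i + 1 == len(records):
            store.save(task_id, i + 1, counts)
            reducer_inbox[task_id] = dict(counts)  # proactive push (simulated)
    return counts, processed


records = ["a b", "b c", "a a", "c", "b"]
store = CheckpointStore()
inbox = {}

# First attempt fails before record 3; the last checkpoint covers records 0 and 1.
try:
    run_map_task("map-0", records, store, inbox, fail_at=3)
except TaskFailure:
    pass

# The recovered attempt (which could run on any node) resumes at record 2.
counts, processed = run_map_task("map-0", records, store, inbox)
print(counts, processed)
```

In this toy run the recovered attempt reprocesses only the three records after the last checkpoint rather than all five. The checkpoint interval trades recovery time against checkpointing overhead; the value used here is arbitrary.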



Title: Fast Recovery MapReduce (FAR-MR) to accelerate failure recovery in big data applications
Authors: Yongqing Zhu, Juniarto Samsudin, Renuga Kanagavelu, Weiwen Zhang, Long Wang, Theint Theint Aye, Rick Siow Mong Goh
Publication date: 15 December 2018
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 5/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-018-2716-8

