The Journal of Supercomputing

09-05-2019

NADE: nodes performance awareness and accurate distance evaluation for degraded read in heterogeneous distributed erasure code-based storage

Authors: Xingjun Zhang, Yi Cai, Yunfei Liu, Zhiwei Xu, Xiaoshe Dong

Published in: The Journal of Supercomputing | Issue 7/2020

Abstract

To ensure data availability while saving storage space, storage systems usually use erasure codes to store data across multiple storage nodes (or servers). When a node failure causes some data blocks to be lost, the system must reconstruct the complete data to serve read requests; this is a degraded read. However, degraded reads in erasure code-based storage systems do not fully utilize node resources and ignore the nodes' topology. In this paper, we propose a real-time performance evaluation model for storage nodes that evaluates each node by combining metric selection with the analytic hierarchy process. We also design a cost evaluation method that calculates the transmission cost by taking the nodes' topology into account. By combining the node evaluation method with the distance calculation, we propose an adaptive degraded read optimization strategy, NADE. We further implement the NADE node selection method in Ceph. The evaluation results show the efficiency of the proposed method.
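The abstract does not state the paper's actual metrics, weights, or cost formula, so the following is only a minimal Python sketch of the general idea: derive metric weights with the analytic hierarchy process, score each node, and pick helper nodes for a degraded read by trading score against topological distance. The metric names, the pairwise comparison values, the rack-based distance, the cost formula `(1 + distance) / score`, and all function names are illustrative assumptions, not the authors' method or any Ceph API.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] says how much more important metric i is than metric j.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def node_score(metrics, weights):
    """Weighted sum of metrics, each assumed normalized to [0, 1], higher is better."""
    return sum(w * m for w, m in zip(weights, metrics))

def topology_distance(reader_rack, node_rack):
    """Hypothetical distance: 1 within the same rack, 2 across racks."""
    return 1 if reader_rack == node_rack else 2

def select_helpers(reader_rack, nodes, weights, k):
    """Return the k node names with the lowest assumed cost (1 + distance) / score."""
    def cost(name):
        info = nodes[name]
        score = max(node_score(info["metrics"], weights), 1e-9)  # avoid division by zero
        return (1 + topology_distance(reader_rack, info["rack"])) / score
    # Sort by cost, breaking ties by name so the result is deterministic.
    return sorted(nodes, key=lambda name: (cost(name), name))[:k]

# Assumed metrics, in order: CPU headroom, disk throughput, network bandwidth.
# Assumed judgments: CPU is 2x as important as disk and 4x as important as network.
PAIRWISE = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

# Hypothetical cluster: two racks, two storage nodes each.
NODES = {
    "osd1": {"rack": "r1", "metrics": [0.9, 0.8, 0.7]},
    "osd2": {"rack": "r1", "metrics": [0.2, 0.3, 0.4]},
    "osd3": {"rack": "r2", "metrics": [0.9, 0.9, 0.9]},
    "osd4": {"rack": "r2", "metrics": [0.5, 0.5, 0.5]},
}

if __name__ == "__main__":
    weights = ahp_weights(PAIRWISE)
    print(select_helpers("r1", NODES, weights, k=2))
```

With these assumed values, a lightly loaded node in a remote rack can be preferred over a heavily loaded node in the reader's own rack, which is the kind of trade-off that combining node performance with distance is meant to capture.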

Title: NADE: nodes performance awareness and accurate distance evaluation for degraded read in heterogeneous distributed erasure code-based storage
Authors: Xingjun Zhang, Yi Cai, Yunfei Liu, Zhiwei Xu, Xiaoshe Dong
Publication date: 09-05-2019
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 7/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-019-02879-6
