Skip to main content
Erschienen in: The Journal of Supercomputing 1/2019

10.12.2015

Towards a delivery scheme for speedup of data backup in distributed storage systems using erasure codes

verfasst von: Pengfei You, Zhen Huang, Yuxing Peng, Changjian Wang, Guofeng Yan

Erschienen in: The Journal of Supercomputing | Ausgabe 1/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Distributed storage systems, built on peer-to-peer networks, can provide large-scale data storage and high data reliability by redundancy. Data backup is the process to store data into a set of redundant storage nodes. Rapid completion of such a process is very critical to maintain system performance. In traditional data backup in distributed systems based on erasure codes, star-structured scheme is used, in which each redundant block is just sent to each target storage node from the source node directly, so the storage throughput and delay are limited by the bottleneck bandwidth, due to bandwidth heterogeneity. The recent “in-network” redundancy generation scheme uses locally repairable property of self-repairing codes to speed up data backup. However, such kind of code does not own maximum distance separable property, thus does not achieve optimal storage efficiency. We still lack a fast backup scheme in distributed systems based on general erasure coding. To this end, we proposed that instead of only focusing on bandwidths between the source node and target nodes, the bandwidths between target storage nodes should be fully taken into account. In our scheme, each redundant data block is divided into some parts according to different proportions and each part of the block is sent to the target storage node via other different storage nodes. The benefit is that spare bandwidths between target storage nodes are used to reduce backup time. We further show how this process can be modeled and derive a formula about the final backup time. We can achieve minimum backup time by solution for classical quadratic programming problem. We conduct both numerical analysis and experimental study. Our experiments shows, the delay reduces 59 %, compared with common star-structured scheme. Meanwhile, the throughput is increased significantly in backup process.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Beaver D, Kumar S, Li HC, Sobel J, Vajgel P (2010) Finding a needle in Haystack: Facebooks photo storage. In: Proceedings of the 9th USENIX conference on operating systems design and implementation (OSDI), vol 10, pp 1–8 Beaver D, Kumar S, Li HC, Sobel J, Vajgel P (2010) Finding a needle in Haystack: Facebooks photo storage. In: Proceedings of the 9th USENIX conference on operating systems design and implementation (OSDI), vol 10, pp 1–8
4.
Zurück zum Zitat Ghemawat S, Gobio H, Leung S (2003) The google file system. ACM Sympos Oper Syst Princ 37:29–43CrossRef Ghemawat S, Gobio H, Leung S (2003) The google file system. ACM Sympos Oper Syst Princ 37:29–43CrossRef
6.
Zurück zum Zitat Hastorun D, Jampani M, Kakulapati G et al (2007) Dynamo: Amazons highly available key-value store. In: ACM SIGOPS operating systems review, vol 41, pp 205–220 Hastorun D, Jampani M, Kakulapati G et al (2007) Dynamo: Amazons highly available key-value store. In: ACM SIGOPS operating systems review, vol 41, pp 205–220
8.
Zurück zum Zitat Liu S, Schulze JP, Herr L, Weekley JD, Zhu B, van Osdol N, Plepys D, Wan M (2011) CineGrid exchange: a workow-based peta-scale distributed storage platform on a high-speed network. Future Gener Comput Syst 27(7):966–976CrossRef Liu S, Schulze JP, Herr L, Weekley JD, Zhu B, van Osdol N, Plepys D, Wan M (2011) CineGrid exchange: a workow-based peta-scale distributed storage platform on a high-speed network. Future Gener Comput Syst 27(7):966–976CrossRef
9.
Zurück zum Zitat Kubiatowicz J, Bindel D, Chen Y, Czerwinski S, Eaton P, Geels D, Gummadi R, Rhea S, Weatherspoon H, Weimer W, Wells C, Zhao B (2000) Oceanstore: an architecture for global-scale persistent storage. SIGPLAN Not 35:190–201CrossRef Kubiatowicz J, Bindel D, Chen Y, Czerwinski S, Eaton P, Geels D, Gummadi R, Rhea S, Weatherspoon H, Weimer W, Wells C, Zhao B (2000) Oceanstore: an architecture for global-scale persistent storage. SIGPLAN Not 35:190–201CrossRef
10.
Zurück zum Zitat Duminuco A, Biersack E (2009) A practical study of regenerating codes for peer-to-peer backup systems. In: Proceedings of the IEEE international conference on distributed computing systems (ICDCS), pp 376–384 Duminuco A, Biersack E (2009) A practical study of regenerating codes for peer-to-peer backup systems. In: Proceedings of the IEEE international conference on distributed computing systems (ICDCS), pp 376–384
11.
Zurück zum Zitat Pamies-Juarez L, Datta A, Oggier FE (2013) In-network redundancy generation for opportunistic speedup of data backup. Future Gener Comput Syst 29(6):1353–1362CrossRef Pamies-Juarez L, Datta A, Oggier FE (2013) In-network redundancy generation for opportunistic speedup of data backup. Future Gener Comput Syst 29(6):1353–1362CrossRef
12.
Zurück zum Zitat Pamies-Juarez L, Datta A, Oggier FE (2013) Data insertion and archiving in erasure-coding based large-scale storage systems. In: Proceedings of the international conference on distributed computing and internet technology (ICDCIT), pp 47–68 Pamies-Juarez L, Datta A, Oggier FE (2013) Data insertion and archiving in erasure-coding based large-scale storage systems. In: Proceedings of the international conference on distributed computing and internet technology (ICDCIT), pp 47–68
13.
Zurück zum Zitat Li J, Li B (2013) Erasure coding for cloud storage systems: a survey. Tsinghua Sci Technol 18(3):259–272CrossRef Li J, Li B (2013) Erasure coding for cloud storage systems: a survey. Tsinghua Sci Technol 18(3):259–272CrossRef
14.
Zurück zum Zitat Reed I, Solomon G (1960) Polynomial codes over certain nite elds. J Soc Ind Appl Math 8:300–304CrossRef Reed I, Solomon G (1960) Polynomial codes over certain nite elds. J Soc Ind Appl Math 8:300–304CrossRef
15.
Zurück zum Zitat Acedanski S, Deb S, Medard M, Koetter R (2005) How good is random linear coding based distributed networked storage? In: Proceedings of the 1st workshop on network coding, theory and applications, pp 1–6 Acedanski S, Deb S, Medard M, Koetter R (2005) How good is random linear coding based distributed networked storage? In: Proceedings of the 1st workshop on network coding, theory and applications, pp 1–6
16.
Zurück zum Zitat Dimakis A, Godfrey P, Wainwright M, Ramchandran K (2007) Network coding for distributed storage systems. In: Proceedings of the INFOCOM, pp 2000–2008 Dimakis A, Godfrey P, Wainwright M, Ramchandran K (2007) Network coding for distributed storage systems. In: Proceedings of the INFOCOM, pp 2000–2008
17.
Zurück zum Zitat Oggier F, Datta A (2011) Self-repairing homomorphic codes for distributed storage systems. In: Proceedings of the international conference on computer communications (INFOCOM), pp 1215–1223 Oggier F, Datta A (2011) Self-repairing homomorphic codes for distributed storage systems. In: Proceedings of the international conference on computer communications (INFOCOM), pp 1215–1223
18.
Zurück zum Zitat Lee S-J, Banerjee S, Sharma P, Yalagandula P, Basu S (2008) Bandwidth-aware routing in overlay networks. In: Proceedings of the 27th conference on computer communications, pp 1732–1740 Lee S-J, Banerjee S, Sharma P, Yalagandula P, Basu S (2008) Bandwidth-aware routing in overlay networks. In: Proceedings of the 27th conference on computer communications, pp 1732–1740
19.
20.
Zurück zum Zitat Brayton RK, Director SW, Hachtel GD, Vidigal L (1979) A new algorithm for statistical circuit design based on quasi-Newton methods and function splitting. IEEE Trans Circuits Syst CAS–26:784–794CrossRef Brayton RK, Director SW, Hachtel GD, Vidigal L (1979) A new algorithm for statistical circuit design based on quasi-Newton methods and function splitting. IEEE Trans Circuits Syst CAS–26:784–794CrossRef
21.
Zurück zum Zitat Fletcher R (2010) The sequential quadratic programming method. Nonlinear optimization. In: Lecture notes in mathematics, pp 165–214 Fletcher R (2010) The sequential quadratic programming method. Nonlinear optimization. In: Lecture notes in mathematics, pp 165–214
22.
24.
Zurück zum Zitat Lee S-J, Sharma P, Banerjee S, Basu S, Fonseca R (2005) Measuring bandwidth between planetlab nodes. In: Passive and active network measurement, pp 292–305 Lee S-J, Sharma P, Banerjee S, Basu S, Fonseca R (2005) Measuring bandwidth between planetlab nodes. In: Passive and active network measurement, pp 292–305
Metadaten
Titel
Towards a delivery scheme for speedup of data backup in distributed storage systems using erasure codes
verfasst von
Pengfei You
Zhen Huang
Yuxing Peng
Changjian Wang
Guofeng Yan
Publikationsdatum
10.12.2015
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 1/2019
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-015-1586-6

Weitere Artikel der Ausgabe 1/2019

The Journal of Supercomputing 1/2019 Zur Ausgabe

Premium Partner