Published in: The Journal of Supercomputing 9/2020

16.01.2020

Dynamic replication factor model for Linux containers-based cloud systems

Authors: Heithem Abbes, Thouraya Louati, Christophe Cérin


Abstract

Container-based virtualization for infrastructure-as-a-service is gaining interest as a platform for running distributed applications. As cloud architectures grow in scale, faults become frequent, which makes availability a true challenge. Replication, whether of checkpoints, containers or data, is a method to survive failures and increase availability. Following a node failure, fault-tolerant cloud systems restart the failed containers on a new node from distributed container images (or checkpoints). With a high failure rate, some replicas can be lost. In some cases it is therefore worthwhile to increase the replication factor and to find a trade-off between being able to restart all failed containers and the storage overhead. This paper addresses the issue of adapting the replication factor and contributes a novel replication factor modeling approach that predicts the right replication factor using prediction techniques. These techniques are based on experimental modeling, which analyzes data collected from different executions. We use a regression technique to find the relation between availability and the number of replicas. Experiments on the Grid’5000 testbed, using a real fault-tolerant cloud system, demonstrate the benefits of our proposal in satisfying the availability requirement.
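
The idea of relating availability to the number of replicas by regression can be illustrated with a minimal sketch. It is not the authors' model: it assumes that the logarithm of unavailability falls roughly linearly with the replication factor, fits that line by ordinary least squares, and picks the smallest factor whose predicted availability meets a target. The function names, the `max_replicas` bound, and the sample measurements are all hypothetical and invented for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def min_replicas(samples, target, max_replicas=10):
    """Smallest replication factor whose predicted availability meets target.

    samples: (replicas, measured_availability) pairs, each availability < 1.
    Returns None if no factor up to max_replicas is predicted to suffice.
    """
    xs = [r for r, _ in samples]
    # Assumed model: log-unavailability is linear in the number of replicas.
    ys = [math.log(1.0 - a) for _, a in samples]
    intercept, slope = fit_line(xs, ys)
    for r in range(1, max_replicas + 1):
        predicted = 1.0 - math.exp(intercept + slope * r)
        if predicted >= target:
            return r
    return None


# Hypothetical measurements, invented for illustration only.
SAMPLES = [(1, 0.90), (2, 0.99), (3, 0.999)]
print(min_replicas(SAMPLES, target=0.9995))
```

Capping the search at `max_replicas` reflects the storage-overhead side of the trade-off: if the target cannot be met within the cap, the sketch reports that rather than recommending an unbounded number of replicas.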


Metadata
Title
Dynamic replication factor model for Linux containers-based cloud systems
Authors
Heithem Abbes
Thouraya Louati
Christophe Cérin
Publication date
16.01.2020
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 9/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-020-03158-5
