
OFP-TM: an online VM failure prediction and tolerance model towards high availability of cloud computing environments

Authors: Deepika Saxena, Ashutosh Kumar Singh

Published in: The Journal of Supercomputing | Issue 6/2022

Published: 6 January 2022


Abstract

Cloud computing now underpins nearly every digital service, and its resource usage has grown exponentially as a result. This ever-growing demand for cloud resources undermines service availability and leads to critical problems such as cloud outages, SLA violations, and excessive power consumption. Previous approaches have addressed the problem by using multiple cloud platforms or by running multiple replicas of a virtual machine (VM), both of which incur high operational cost. This paper takes a different approach and proposes a novel Online virtual machine Failure Prediction and Tolerance Model (OFP-TM) that embeds high-availability awareness in physical machines as well as virtual machines. Failure-prone VMs are identified in real time from their predicted future resource usage, which is estimated by an ensemble-based resource predictor. These VMs are assigned to a failure tolerance unit comprising a resource provision matrix and a Selection Box (S-Box) mechanism, which triggers the migration of failure-prone VMs and handles potential outages in advance while maintaining the desired level of availability for cloud users. The proposed model is evaluated against existing related approaches in a simulated cloud environment, with several experiments run on a real-world workload, the Google Cluster dataset. The results show that OFP-TM improves availability by up to 33.5% and reduces the number of live VM migrations by up to 83.3% compared with operation without OFP-TM.
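The prediction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the base predictors, the way they are combined, or the failure criterion. The three base predictors, the unweighted averaging, the `threshold` of 0.9, and all function names and sample values below are assumptions chosen only to show the general idea of flagging VMs from ensemble-predicted utilisation. The failure tolerance unit (resource provision matrix and S-Box) is not sketched, because the abstract gives no details of it.

```python
from statistics import mean

# Hypothetical base predictors; the paper's actual ensemble members are not
# given in the abstract. Each takes a non-empty utilisation history in [0, 1].

def moving_average(history, window=3):
    """Predict the next value as the mean of the last `window` samples."""
    return mean(history[-window:])

def last_value(history):
    """Naive predictor: the next value equals the latest sample."""
    return history[-1]

def linear_trend(history):
    """Extrapolate one step ahead from the last two samples."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])

def ensemble_predict(history, predictors=(moving_average, last_value, linear_trend)):
    """Average the base predictions (assumed combination rule) and clamp to [0, 1]."""
    estimate = mean(p(history) for p in predictors)
    return min(1.0, max(0.0, estimate))

def failure_prone_vms(usage_by_vm, threshold=0.9):
    """Return sorted IDs of VMs whose predicted utilisation reaches `threshold`.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return sorted(
        vm for vm, history in usage_by_vm.items()
        if ensemble_predict(history) >= threshold
    )

# Made-up utilisation histories (fraction of capacity), for illustration only.
usage = {
    "vm-a": [0.55, 0.70, 0.95],  # rising sharply
    "vm-b": [0.30, 0.35, 0.30],  # low and stable
    "vm-c": [0.92, 0.90, 0.88],  # high but falling
}
flagged = failure_prone_vms(usage)
print(flagged)
```

With these sample histories only the sharply rising VM is flagged, while the VM whose load is high but falling is not, which is the kind of distinction a forward-looking predictor is meant to capture. In a fuller model the flagged VMs would then be handed to a migration-selection step.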



Title: OFP-TM: an online VM failure prediction and tolerance model towards high availability of cloud computing environments
Authors: Deepika Saxena, Ashutosh Kumar Singh
Publication date: 6 January 2022
Publisher: Springer US
Published in: The Journal of Supercomputing, Issue 6/2022
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-021-04235-z
