A hybrid data replication strategy with fuzzy-based deletion for heterogeneous cloud data centers

The Journal of Supercomputing

Abstract

At present, large cloud-based applications place increasing demands on data center storage. In a large-scale cloud environment, data replication offers an appropriate way to manage data files and improves data reliability and availability. In this paper, we propose a data replication algorithm called hybrid replication strategy (HRS) that covers the replica placement, selection, and replacement steps. HRS has three main phases and is suitable for replicating data files in the cloud. In the first phase, it selects the best site for storing a new replica (i.e., the most central site with a high number of accesses) to reduce access time. In the second phase, HRS selects the best replica node for users based on parameters such as CPU processing capability, network transmission capability, disk I/O capability, load, and network latency. In the third phase, it makes the replacement decision so as to provide better response time. HRS determines the importance of replicas using a fuzzy inference system with three input parameters (i.e., number of accesses, cost, and the last time the replica was accessed). The new replication policy is simulated using the CloudSim toolkit. The proposed mechanism replicates data over the cloud nodes reasonably well and is easily implementable in a real environment. Experimental results show that HRS can significantly enhance availability, performance, and load balance for data-intensive applications, without incurring additional overhead.
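
To make the third phase more concrete, the following is a minimal Python sketch of how a fuzzy inference system with the three inputs named in the abstract (number of accesses, cost, and time since last access) might rank replicas for deletion. Everything beyond those three inputs is an assumption for illustration only: the inputs are taken as already normalized to [0, 1], the linear `low`/`high` membership functions, the `RULES` table, and the zero-order Sugeno-style weighted average are hypothetical choices, and the names `Replica`, `replica_value`, and `pick_victim` do not come from the paper.

```python
from dataclasses import dataclass


def low(x: float) -> float:
    """Membership in an assumed linear 'low' fuzzy set for x in [0, 1]."""
    return max(0.0, min(1.0, 1.0 - x))


def high(x: float) -> float:
    """Membership in an assumed linear 'high' fuzzy set for x in [0, 1]."""
    return max(0.0, min(1.0, x))


@dataclass
class Replica:
    name: str
    accesses: float  # normalized number of accesses, 0..1
    cost: float      # normalized cost of obtaining the replica again, 0..1
    idle: float      # normalized time since the last access, 0..1


# Hypothetical rule base (not from the paper). Each entry reads:
# IF accesses is A AND cost is C AND idle is I THEN value = v.
RULES = [
    ((high, high, low), 1.0),
    ((high, low, low), 0.8),
    ((high, high, high), 0.6),
    ((low, high, low), 0.6),
    ((high, low, high), 0.4),
    ((low, high, high), 0.3),
    ((low, low, low), 0.3),
    ((low, low, high), 0.0),
]


def replica_value(r: Replica) -> float:
    """Score a replica: min for AND, weighted average of rule outputs."""
    num = 0.0
    den = 0.0
    for (f_acc, f_cost, f_idle), out in RULES:
        strength = min(f_acc(r.accesses), f_cost(r.cost), f_idle(r.idle))
        num += strength * out
        den += strength
    return num / den if den > 0 else 0.0


def pick_victim(replicas):
    """Return the replica with the lowest fuzzy value (deletion candidate)."""
    return min(replicas, key=replica_value)


if __name__ == "__main__":
    store = [
        Replica("hot", accesses=0.9, cost=0.8, idle=0.1),
        Replica("cold", accesses=0.1, cost=0.2, idle=0.9),
    ]
    print(pick_victim(store).name)
```

Under these assumed rules, a frequently accessed, expensive, recently used replica scores higher than a rarely accessed, cheap, stale one, so the latter is chosen for replacement first. A real implementation would need the membership functions and rule base defined in the full paper.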

Author information

Correspondence to M. M. Javidi.

Cite this article

Mansouri, N., Javidi, M.M. A hybrid data replication strategy with fuzzy-based deletion for heterogeneous cloud data centers. J Supercomput 74, 5349–5372 (2018). https://doi.org/10.1007/s11227-018-2427-1
