Published in: Cluster Computing 1/2019

24-07-2018

SORD: a new strategy of online replica deduplication in Cloud-P2P

Authors: ShengYao Sun, WenBin Yao, XiaoYong Li

Abstract

In an online Cloud-P2P system, more replicas lower access delay but raise maintenance overhead, and fewer replicas do the reverse. Traditional strategies of online replica deduplication usually use a dynamic threshold to delete redundant replicas. Because the number of accesses to a replica varies over time, and every replica can serve only a certain number of requests, deleting a replica may affect other nodes, overloading them and degrading system performance. Traditional strategies pay too little attention to this impact. To address the problem, this paper proposes a new strategy of online replica deduplication (SORD) that reduces the impact on other nodes when a redundant replica is deleted. To reduce the impact, SORD uses prediction-based evaluation to delete redundant replicas. Before deleting a replica, it applies fuzzy clustering analysis to select the optimal deletion replica from the file’s replica set. Based on the historical visiting information of the optimal deletion replica and the capacity of the nodes, SORD evaluates the impact on other nodes to decide whether the replica can be deleted. Extensive experiments demonstrate that SORD improves access latency by around 5–15% on average and achieves better load balance than other similar methods. It also removes about 65% of redundant replicas.
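The abstract does not give the evaluation formulas, so the following is only a minimal sketch of the general idea of checking a deletion's impact before carrying it out. It assumes, hypothetically, that the candidate's future load is forecast by a moving average over its access history and that the deletion is acceptable when the remaining nodes have enough combined spare capacity. The names `Node`, `predict_load`, and `can_delete` are illustrative and not taken from the paper, and the fuzzy clustering step that selects the candidate is not shown.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A node holding another replica of the same file (hypothetical model)."""
    name: str
    capacity: float      # assumed: max requests per interval the node can serve
    current_load: float  # assumed: requests per interval it serves now


def predict_load(history, window=3):
    """Forecast next-interval requests as the mean of the last `window` counts.

    A moving average is an assumption made for this sketch; the paper's
    actual predictor is not described in the abstract.
    """
    recent = history[-window:]
    if not recent:
        return 0.0
    return sum(recent) / len(recent)


def can_delete(history, other_nodes, window=3):
    """Return True if the other nodes could absorb the candidate's predicted load.

    Simplification: spare capacity is pooled across nodes, ignoring how
    requests would actually be routed after the deletion.
    """
    predicted = predict_load(history, window)
    spare = sum(max(n.capacity - n.current_load, 0.0) for n in other_nodes)
    return spare >= predicted


if __name__ == "__main__":
    nodes = [Node("a", capacity=50, current_load=40),
             Node("b", capacity=30, current_load=15)]
    print(can_delete([10, 20, 30], nodes))
    print(can_delete([40, 40, 40], nodes))
```

Pooling spare capacity keeps the check short; a closer model would assign the predicted requests node by node and reject the deletion if any single node exceeded its capacity.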

Metadata
Title
SORD: a new strategy of online replica deduplication in Cloud-P2P
Authors
ShengYao Sun
WenBin Yao
XiaoYong Li
Publication date
24-07-2018
Publisher
Springer US
Published in
Cluster Computing / Issue 1/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-2819-2
