Published in: Cluster Computing 1/2019

24.07.2018

SORD: a new strategy of online replica deduplication in Cloud-P2P

Authors: ShengYao Sun, WenBin Yao, XiaoYong Li

Abstract

In an online Cloud-P2P system, more replicas lower access delay but raise maintenance overhead, and fewer replicas do the reverse. Traditional strategies of online replica deduplication usually delete redundant replicas by means of a dynamic threshold. Because the access volume of a replica varies over time and each replica can serve only a certain number of requests, deleting a replica may affect other nodes, overloading them and degrading system performance. Traditional strategies pay too little attention to this impact. To address the problem, this paper proposes a new strategy of online replica deduplication (SORD) that reduces the impact on other nodes when a redundant replica is deleted. To this end, SORD uses prediction-based evaluation to decide on deletions. Before deleting a replica, it applies fuzzy clustering analysis to select the optimal deletion candidate from the file’s replica set. Based on the historical access information of that candidate and the capacity of the nodes, SORD then evaluates the impact on other nodes to decide whether the replica can be deleted. Extensive experiments demonstrate that SORD improves access latency by around 5–15% on average and achieves better load balance than other similar methods, while removing about 65% of redundant replicas.
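The abstract does not give SORD's formulas, so the following is only a minimal sketch of the final decision step (evaluating the impact on other nodes before deleting a candidate), not the authors' implementation. It assumes, purely for illustration, that future load is predicted as a moving average of recent access counts and that the deleted replica's requests are spread evenly over the remaining replica nodes. The fuzzy clustering step that selects the candidate is omitted, and the names `predict_load`, `can_delete`, `history`, `capacity`, and `window` are hypothetical.

```python
from statistics import mean


def predict_load(history, window=3):
    """Predict next-period requests as the mean of the last `window` samples.

    Assumption: a simple moving average stands in for whatever predictor
    the paper actually uses.
    """
    recent = history[-window:]
    return mean(recent) if recent else 0.0


def can_delete(candidate, replicas, window=3):
    """Return True if deleting `candidate` is predicted to overload no other node.

    replicas maps a node name to {"history": [request counts], "capacity": number}.
    Assumption: the candidate's predicted load is shared evenly by the
    remaining replica nodes.
    """
    remaining = {n: r for n, r in replicas.items() if n != candidate}
    if not remaining:
        return False  # never delete the last replica of a file
    shifted = predict_load(replicas[candidate]["history"], window) / len(remaining)
    return all(
        predict_load(r["history"], window) + shifted <= r["capacity"]
        for r in remaining.values()
    )


replicas = {
    "a": {"history": [10, 10, 10], "capacity": 50},
    "b": {"history": [30, 30, 30], "capacity": 38},
    "c": {"history": [20, 20, 20], "capacity": 40},
}
deletable = [node for node in replicas if can_delete(node, replicas)]
```

With these made-up numbers, deleting `c` would push node `b` past its capacity, so `c` is rejected while `a` and `b` remain deletable. A threshold-only strategy would not perform this check.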

Metadata
Title
SORD: a new strategy of online replica deduplication in Cloud-P2P
Authors
ShengYao Sun
WenBin Yao
XiaoYong Li
Publication date
24.07.2018
Publisher
Springer US
Published in
Cluster Computing / Issue 1/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-2819-2
