2017 | OriginalPaper | Chapter

Privacy-Aware Data Sharing in a Tree-Based Categorical Clustering Algorithm

Authors: Mina Sheikhalishahi, Mohamed Mejri, Nadia Tawbi, Fabio Martinelli

Published in: Foundations and Practice of Security

Publisher: Springer International Publishing

Abstract

Although clustering is one of the most common approaches in unsupervised data analysis, very little literature applies formal methods to data mining problems. This paper uses an abstract representation of a hierarchical categorical clustering algorithm (CCTree) to solve the problem of privacy-aware data clustering among distributed agents. The proposed methodology is based on rewriting systems and automatically generates a global structure of the clusters. We prove that the proposed approach improves the time complexity. Moreover, we provide a metric to measure the privacy gain after the CCTree result is revealed. Finally, we discuss the conditions under which CCTree clustering in a distributed framework produces a result comparable to the centralized one.

Metadata
DOI
https://doi.org/10.1007/978-3-319-51966-1_11
