Skip to main content
Top
Published in: Knowledge and Information Systems 3/2015

01-12-2015 | Regular Paper

On the anonymizability of graphs

Authors: Charu C. Aggarwal, Yao Li, Philip S. Yu

Published in: Knowledge and Information Systems | Issue 3/2015

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Many applications such as social networks, recommendation systems, email communication patterns, and other collaborative applications are built on top of graph infrastructures. The data stored on such networks may contain personal information about individuals and may therefore be sensitive from a privacy point of view. Therefore, a natural solution is to remove identifying information from the nodes and perturb the graph structure, so that re-identification becomes difficult. Typical graphs encountered in real applications are sparse. In this paper, we will show that sparse graphs have certain theoretical properties which make them susceptible to re-identification attacks. We design a systematic way to exploit these theoretical properties in order to construct re-identification signatures, which are also known as characteristic vectors. These signatures have the property that they are extremely robust to perturbations, especially for sparse graphs. We use these signatures in order to create an effective attack algorithm. We supplement our theoretical results with experimental tests using a number of algorithms on real data sets. These results confirm that even low levels of anonymization require perturbation levels which are significant enough to result in a massive loss of utility. Our experimental results also show that the true anonymization level of graphs is much lower than is implied by measures such as \(k\)-anonymity. Thus, the results of this paper establish that the problem of graph anonymization has fundamental theoretical barriers which prevent a fully effective solution.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Footnotes
1
In the case of many social networking sites, the links are typically symmetric and therefore form an undirected network. In the case of a directed network, it is assumed that both inlinks and outlinks are known.
 
Literature
1.
go back to reference Aggarwal CC, Li Y, Yu P (2011) On the hardness of graph anonymization, ICDM conference Aggarwal CC, Li Y, Yu P (2011) On the hardness of graph anonymization, ICDM conference
2.
go back to reference Aggarwal CC, Yu PS (2008) Privacy-preserving data mining: models and algorithms, Springer, BerlinCrossRef Aggarwal CC, Yu PS (2008) Privacy-preserving data mining: models and algorithms, Springer, BerlinCrossRef
3.
go back to reference Agrawal R, Srikant R (2000) Privacy-preserving data mining. ACM SIGMOD conference Agrawal R, Srikant R (2000) Privacy-preserving data mining. ACM SIGMOD conference
4.
go back to reference Agrawal D, Aggarwal CC (2001) On the design and quantification of privacy preserving data mining algorithms. ACM PODS conference Agrawal D, Aggarwal CC (2001) On the design and quantification of privacy preserving data mining algorithms. ACM PODS conference
5.
go back to reference Ahuja R, Orlin J, Magnanti T (1992) Network flows: theory, algorithms, and applications. Prentice Hall, Englewood Cliffs, NJ, USA Ahuja R, Orlin J, Magnanti T (1992) Network flows: theory, algorithms, and applications. Prentice Hall, Englewood Cliffs, NJ, USA
6.
go back to reference Backstrom L, Dwork C, Kleinberg J (2007)Wherefore Art Thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. WWW conference Backstrom L, Dwork C, Kleinberg J (2007)Wherefore Art Thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. WWW conference
7.
go back to reference Bayardo RJ, Agrawal R (2005) Data privacy through optimal k-anonymization. ICDE conference Bayardo RJ, Agrawal R (2005) Data privacy through optimal k-anonymization. ICDE conference
8.
go back to reference Cormode G, Srivastava D, Yu T, Zhang Q (2008) Anonymizing bipartite graph data using safe groupings. VLDB conference Cormode G, Srivastava D, Yu T, Zhang Q (2008) Anonymizing bipartite graph data using safe groupings. VLDB conference
9.
go back to reference Evfimievski A, Srikant R, Agrawal R, Gehrke J (2002) Privacy preserving mining of association rules. ACM KDD conference Evfimievski A, Srikant R, Agrawal R, Gehrke J (2002) Privacy preserving mining of association rules. ACM KDD conference
10.
go back to reference Fung B, Wang K, Yu PS (2007) Anonymizing classification data for privacy preservation. IEEE TKDE, pp 711–725 Fung B, Wang K, Yu PS (2007) Anonymizing classification data for privacy preservation. IEEE TKDE, pp 711–725
11.
go back to reference Garey M, Johnson D (1979) Computers and intractability: a guide to the theory of NP-completeness, Freeman Garey M, Johnson D (1979) Computers and intractability: a guide to the theory of NP-completeness, Freeman
12.
go back to reference Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in social networks, VLDB conference Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in social networks, VLDB conference
13.
go back to reference Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Technical report 07–19. University of Massachusetts, Amherst Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Technical report 07–19. University of Massachusetts, Amherst
14.
go back to reference Kifer D, Gehrke J (2006) Injecting utility into anonymized datasets. SIGMOD conference, pp 217–228 Kifer D, Gehrke J (2006) Injecting utility into anonymized datasets. SIGMOD conference, pp 217–228
15.
go back to reference LeFevre K, DeWitt D, Ramakrishnan R (2006) Mondrian multidimensional k-anonymity. ICDE conference LeFevre K, DeWitt D, Ramakrishnan R (2006) Mondrian multidimensional k-anonymity. ICDE conference
16.
go back to reference Liu K, Terzi E (2008) Towards identity anonymization on graphs. ACM SIGMOD conference Liu K, Terzi E (2008) Towards identity anonymization on graphs. ACM SIGMOD conference
17.
go back to reference Machanavajjhala A A, Gehrke J, Kifer D, Venkitasubramaniam M (2006) l-Diversity: privacy beyond k-anonymity. ICDE conference Machanavajjhala A A, Gehrke J, Kifer D, Venkitasubramaniam M (2006) l-Diversity: privacy beyond k-anonymity. ICDE conference
18.
go back to reference Samarati P (2001) Protecting respondents identities in microdata release. IEEE TKDE 13(6):1010–1027 Samarati P (2001) Protecting respondents identities in microdata release. IEEE TKDE 13(6):1010–1027
19.
go back to reference Tassa T, Cohen D (2013) Anonymization of centralized and distributed social networks by sequential clustering. IEEE TKDE 25:311–324 Tassa T, Cohen D (2013) Anonymization of centralized and distributed social networks by sequential clustering. IEEE TKDE 25:311–324
20.
go back to reference Vuokko N, Terzi E (2010) Reconstructing randomized social networks, SDM Conf., Vuokko N, Terzi E (2010) Reconstructing randomized social networks, SDM Conf.,
21.
go back to reference Wu L, Ying X, Wu X (2010) Reconstruction from randomized graph via low rank approximation, SDM Conf., Wu L, Ying X, Wu X (2010) Reconstruction from randomized graph via low rank approximation, SDM Conf.,
22.
go back to reference Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. SDM conference Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. SDM conference
23.
go back to reference Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and \(k\)-degree anonymization schemes for privacy-preserving social network publishing. ACM KDD Conference Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and \(k\)-degree anonymization schemes for privacy-preserving social network publishing. ACM KDD Conference
24.
go back to reference Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. ICDE conference Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. ICDE conference
Metadata
Title
On the anonymizability of graphs
Authors
Charu C. Aggarwal
Yao Li
Philip S. Yu
Publication date
01-12-2015
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 3/2015
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-014-0788-1

Other articles of this Issue 3/2015

Knowledge and Information Systems 3/2015 Go to the issue

Premium Partner