Abstract
We identify privacy risks associated with releasing network data sets and provide an algorithm that mitigates those risks. A network consists of entities connected by links representing relations such as friendship, communication, or shared activity. Maintaining privacy when publishing networked data is uniquely challenging because an individual's network context can be used to identify them even if other identifying information is removed. In this paper, we quantify the privacy risks associated with three classes of attacks on the privacy of individuals in networks, based on the knowledge used by the adversary. We show that the risks of these attacks vary greatly based on network structure and size. We propose a novel approach to anonymizing network data that models aggregate network structure and then allows samples to be drawn from that model. The approach guarantees anonymity for network entities while preserving the ability to estimate a wide variety of network measures with relatively little bias.
- W. Aiello, F. Chung, and L. Lu. A random graph model for massive graphs. In STOC, 2000. Google ScholarDigital Library
- R. Albert, H. Jeong, and A.-L. Barabasi. Error and attack tolerance of complex networks. Nature, 406:378, 2000.Google ScholarCross Ref
- L. Babai and L. Kucera. Canonical labeling of graphs in linear average time. In FOCS, 1979. Google ScholarDigital Library
- L. Backstrom, C. Dwork, and J. Kleinberg. Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. In WWW, 2007. Google ScholarDigital Library
- A. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286, 1999.Google Scholar
- G. Bianconi and M. Marsili. Emergence of large cliques in random scale-free networks. Europhysics Letters, 74(4):740--746, 2006.Google ScholarCross Ref
- W. W. Cohen. Enron email dataset, 2005.Google Scholar
- D. G. Corneil and C. C. Gotlieb. An efficient algorithm for graph isomorphism. J. ACM, 17(1):51--64, 1970. Google ScholarDigital Library
- C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, 2006. Google ScholarDigital Library
- P. Erdős and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci, 5:17--61, 1960.Google Scholar
- N. Friedkin. Horizons of observability and limits of informal control in organizations. Social Forces, 62(1):54--77, 1983.Google ScholarCross Ref
- K. Frikken and P. Golle. Private social network analysis: How to assemble pieces of a graph privately. In WPES, 2006. Google ScholarDigital Library
- J. Grochow and M. Kellis. Network motif discovery using subgraph enumeration and symmetry-breaking. In RECOMB, 2007. Google ScholarDigital Library
- M. Hay, G. Miklau, D. Jensen, P. Weis, and S. Srivastava. Anonymizing social networks. Technical Report 07--19, UMass Amherst, 2007.Google Scholar
- N. Immerman and E. Lander. Describing graphs: A first-order approach to graph canonization. In Complexity Theory Retrospective. Springer-Verlag, 1990.Google Scholar
- A. Korolova, R. Motwani, S. Nabar, and Y. Xu. Link privacy in social networks. In ICDE, 2008. Google ScholarDigital Library
- K. Liu and E. Terzi. Towards identity anonymization on graphs. In SIGMOD, 2008. Google ScholarDigital Library
- A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam. l-diversity: privacy beyond k-anonymity. ICDE, 2006. Google ScholarDigital Library
- D. Martin, D. Kifer, A. Machanavajjhala, J. Gehrke, and J. Halpern. Worst-case background knowledge. ICDE, 2007.Google Scholar
- M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45(2):167--256, 2003.Google ScholarDigital Library
- J. J. Potterat, L. Phillips-Plummer, S. Q. Muth, R. B. Rothenberg, D. E. Woodhouse, T. S. Maldonado-Long, H. P. Zimmerman, and J. B. Muth. Risk network structure in the early epidemic phase of HIV transmission in Colorado Springs. Sexually Trans. Infections, 2002.Google ScholarCross Ref
- V. Rastogi, S. Hong, and D. Suciu. The boundary between privacy and utility in data publishing. In VLDB, 2007. Google ScholarDigital Library
- S. Russell and P. Norvig. AI: A Modern Approach. 2003.Google Scholar
- L. Singh and J. Zhan. Measuring topological anonymity in social networks. In Intl. Conf. on Granular Computing, 2007. Google ScholarDigital Library
- L. Sweeney, k-anonymity: a model for protecting privacy. Journ. of Uncertainty, Fuzziness, and KB Systems, 2002. Google ScholarDigital Library
- D.-W. Wang, C.-J. Liau, and T.-S. Hsu. Privacy protection in social network data disclosure based on granular computing. In International Conference on Fuzzy Systems, 2006.Google Scholar
- D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393:440--442, 1998.Google ScholarCross Ref
- D. B. West. Introduction to Graph Theory. August 2000.Google Scholar
- X. Ying and X. Wu. Randomizing social networks: a spectrum preserving approach. In SIAM Conf. on Data Mining, 2007.Google Scholar
- E. Zheleva and L. Getoor. Preserving the privacy of sensitive relationships in graph data. In PinKDD Workshop, 2007. Google ScholarDigital Library
- B. Zhou and J. Pei. Preserving privacy in social networks against neighborhood attacks. In ICDE, 2008. Google ScholarDigital Library
Index Terms
- Resisting structural re-identification in anonymized social networks
Recommendations
Resisting structural re-identification in anonymized social networks
We identify privacy risks associated with releasing network datasets and provide an algorithm that mitigates those risks. A network dataset is a graph representing entities connected by edges representing relations such as friendship, communication or ...
Structural Diversity for Resisting Community Identification in Published Social Networks
As an increasing number of social networking data is published and shared for commercial and research purposes, privacy issues about the individuals in social networks have become serious concerns. Vertex identification, which identifies a particular ...
Preservation of structural properties in anonymized social networks
COLLABORATECOM '12: Proceedings of the 2012 8th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2012)Social networks such as Facebook, LinkedIn, or Twitter have nowadays a global reach that surpassed all previous expectations. Many social networks gather confidential information of their users, and as a result, the privacy in social networks has become ...
Comments