skip to main content
10.1145/2020408.2020596acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
poster

On the privacy of anonymized networks

Published:21 August 2011Publication History

ABSTRACT

The proliferation of online social networks, and the concomitant accumulation of user data, give rise to hotly debated issues of privacy, security, and control. One specific challenge is the sharing or public release of anonymized data without accidentally leaking personally identifiable information (PII). Unfortunately, it is often difficult to ascertain that sophisticated statistical techniques, potentially employing additional external data sources, are unable to break anonymity. In this paper, we consider an instance of this problem, where the object of interest is the structure of a social network, i.e., a graph describing users and their links. Recent work demonstrates that anonymizing node identities may not be sufficient to keep the network private: the availability of node and link data from another domain, which is correlated with the anonymized network, has been used to re-identify the anonymized nodes. This paper is about conditions under which such a de-anonymization process is possible.

We attempt to shed light on the following question: can we assume that a sufficiently sparse network is inherently anonymous, in the sense that even with unlimited computational power, de-anonymization is impossible? Our approach is to introduce a random graph model for a version of the de-anonymization problem, which is parameterized by the expected node degree and a similarity parameter that controls the correlation between two graphs over the same vertex set. We find simple conditions on these parameters delineating the boundary of privacy, and show that the mean node degree need only grow slightly faster than log n with network size n for nodes to be identifiable. Our results have policy implications for sharing of anonymized network information.

References

  1. AOL Search Data Scandal. http://en.wikipedia.org/wiki/\\AOL_search_data_scandal.Google ScholarGoogle Scholar
  2. W. Aiello, F. Chung, and L. Lu. A Random Graph Model for Massive Graphs. In STOC '00, pages 171--180, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. L. Backstrom, C. Dwork, and J. Kleinberg. Wherefore Art Thou R3579X?: Anonymized Social Networks, Hidden Patterns, and Structural Steganography. In WWW '07, pages 181--190, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang. Complex Networks: Structure and Dynamics. Physics Reports, 424(4--5):175--308, 2006.Google ScholarGoogle Scholar
  5. B. Bollobas. Random Graphs (2nd edition). Cambridge University Press, 2001.Google ScholarGoogle Scholar
  6. A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, and J. Scott. Impact of Human Mobility on Opportunistic Forwarding Algorithms. IEEE Transactions on Mobile Computing, 6:606--620, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. M. Gori, M. Maggini, and L. Sarti. Exact and Approximate Graph Matching Using Random Walks. IEEE Trans. Pattern Anal. Mach. Intell., 27(7):1100--1111, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. R. Gross and A. Acquisti. Information Revelation and Privacy in Online Social Networks. In WPES '05: Proceedings of the 2005 ACM workshop on Privacy in the electronic society, pages 71--80, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. S. Guha, K. Tang, and P. Francis. NOYB: Privacy in Online Social Networks. In WOSN'08: Workshop on Online Social Networks, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. Hay, G. Miklau, D. Jensen, D. Towsley, and C. Li. Resisting Structural Re-Identification in Anonymized Networks. VLDB Journal, 19(6), December 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. R. Henzinger, A. Heydon, M. Mitzenmacher, and M. Najork. On Near-Uniform URL Sampling. In Proceedings of the 9th international World Wide Web conference on Computer networks, pages 295--308, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. Hill, F. Provost, and C. Volinsky. Network-based Marketing: Identifying Likely Adopters via Consumer Networks. Statistical Science, 21:256--276, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  13. S. Janson, T. Łuczak, and A. Rucinski. Random Graphs. Wiley, 2000.Google ScholarGoogle ScholarCross RefCross Ref
  14. B. Karrer and M. E. J. Newman. Random Graph Models for Directed Acyclic Networks. Physical review. E, 2009.Google ScholarGoogle Scholar
  15. A. Korolova, R. Motwani, S. U. Nabar, and Y. Xu. Link Privacy in Social Networks. In CIKM '08: Proceeding of the 17th ACM conference on Information and knowledge management, pages 289--298, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. G. Kossinets, J. Kleinberg, and D. Watts. The Structure of Information Pathways in a Social Communication Network. In KDD '08, pages 435--443, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. B. Krishnamurthy and C. Willis. Characterizing Privacy in Online Social Networks. In WOSN'08: Workshop on Online Social Networks, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Leskovec, L. Backstrom, and J. Kleinberg. Meme-Tracking and the Dynamics of the News Cycle. In KDD '09, pages 497--506, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani. Kronecker Graphs: An Approach to Modeling Networks. J. Mach. Learn. Res., 11:985--1042, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. Leskovec, D. Huttenlocher, and J. Kleinberg. Predicting Positive and Negative Links in Online Social Networks. In WWW '10, pages 641--650, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD '05, pages 177--187, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. B. Luo and E. R. Hancock. Structural Graph Matching Using the EM Algorithm and Singular Value Decomposition. IEEE Trans. Pattern Anal. Mach. Intell., 23(10):1120--1136, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. D. Martin and A. Schulman. Deanonymizing Users of the SafeWeb Anonymizing Service. In Proceedings of the 11th USENIX Security Symposium, pages 123--137, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. A. Mislove, H. S. Koppula, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Growth of the Flickr Social Network. In WOSP '08: Proceedings of the first workshop on Online social networks, pages 25--30, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and Analysis of Online Social Networks. In IMC '07: Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, pages 29--42, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. R. Myers, R. C. Wilson, and E. R. Hancock. Bayesian Graph Edit Distance. IEEE Trans. Pattern Anal. Mach. Intell., 22(6), 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. A. Narayanan and V. Shmatikov. Robust De-anonymization of Large Sparse Datasets. In SP '08: Proceedings of the 2008 IEEE Symposium on Security and Privacy, pages 111--125, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. A. Narayanan and V. Shmatikov. De-anonymizing Social Networks. Security and Privacy, IEEE Symposium on, 0:173--187, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. M. E. J. Newman, D. J. Watts, and S. H. Strogatz. Random graph models of social networks. Proceedings of the National Academy of Sciences of the United States of America, 99(Suppl 1):2566--2572, 2002.Google ScholarGoogle ScholarCross RefCross Ref
  30. L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (Sub)Graph Isomorphism Algorithm for Matching Large Graphs. IEEE Trans. Pattern Anal. Mach. Intell., 26(10):1367--1372, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. P. Pedarsani, D. R. Figueiredo, and M. Grossglauser. Densification Arising from Sampling Fixed Graphs. In SIGMETRICS '08, pages 205--216, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Y. Richter, E. Yom-Tov, and N. Slonim. Predicting Customer Churn in Mobile Networks through Analysis of Social Groups. In Proc. SIAM Int'l Conference on Data Mining (SDM) 2010, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  33. J. Riordan. An Introduction to Combinatorial Analysis. Wiley, 1958.Google ScholarGoogle Scholar
  34. A. Sanfeliu and K. Fu. A Distance Measure between Attributed Relational Graphs for Pattern Recognition. IEEE Transactions On Systems, Man, and Cybernetics, 13(3):353--362, 1983.Google ScholarGoogle ScholarCross RefCross Ref
  35. D. Stutzbach, R. Rejaie, N. Duffield, S. Sen, and W. Willinger. On Unbiased Sampling for Unstructured Peer-to-Peer Networks. IEEE/ACM Trans. Netw., 17(2):377--390, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Y. Tian and J. M. Patel. TALE: A Tool for Approximate Large Graph Matching. Data Engineering, International Conference on, 0:963--972, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. J. R. Ullmann. An Algorithm for Subgraph Isomorphism. J. ACM, 23(1):31--42, 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. G. Wondracek, T. Holz, E. Kirda, and C. Kruegel. A Practical Attack to De-Anonymize Social Network Users. In IEEE Symposium on Security & Privacy, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. B. Zhou and J. Pei. Preserving Privacy in Social Networks Against Neighborhood Attacks. In ICDE '08: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, pages 506--515, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. On the privacy of anonymized networks

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in
            • Published in

              cover image ACM Conferences
              KDD '11: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
              August 2011
              1446 pages
              ISBN:9781450308137
              DOI:10.1145/2020408

              Copyright © 2011 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 21 August 2011

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • poster

              Acceptance Rates

              Overall Acceptance Rate1,133of8,635submissions,13%

              Upcoming Conference

              KDD '24

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader