skip to main content
research-article

An efficient reconciliation algorithm for social networks

Published:01 January 2014Publication History
Skip Abstract Section

Abstract

People today typically use multiple online social networks (Facebook, Twitter, Google+, LinkedIn, etc.). Each online network represents a subset of their "real" ego-networks. An interesting and challenging problem is to reconcile these online networks, that is, to identify all the accounts belonging to the same individual. Besides providing a richer understanding of social dynamics, the problem has a number of practical applications. At first sight, this problem appears algorithmically challenging. Fortunately, a small fraction of individuals explicitly link their accounts across multiple networks; our work leverages these connections to identify a very large fraction of the network.

Our main contributions are to mathematically formalize the problem for the first time, and to design a simple, local, and efficient parallel algorithm to solve it. We are able to prove strong theoretical guarantees on the algorithm's performance on well-established network models (Random Graphs, Preferential Attachment). We also experimentally confirm the effectiveness of the algorithm on synthetic and real social network data sets.

References

  1. Dblp. http://dblp.uni-trier.de/xml/.Google ScholarGoogle Scholar
  2. Wikipedia dumps. http://dumps.wikimedia.org/.Google ScholarGoogle Scholar
  3. F. Abel, N. Henze, E. Herder, and D. Krause. Interweaving public user profiles on the web. In UMAP 2010, pages 16--27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. L. Backstrom, C. Dwork, and J. Kleinberg. Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In WWW 2007, pages 181--190. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.Google ScholarGoogle ScholarCross RefCross Ref
  6. B. Bollobás and O. Riordan. The diameter of a scale-free random graph. Combinatorica, 24(1):5--34, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-mat: A recursive model for graph mining. In SDM 2004.Google ScholarGoogle ScholarCross RefCross Ref
  8. E. Cho, S. A. Myers, and J. Leskovec. Friendship and mobility: Friendship and mobility: User movement in location-based social networks. In KDD 2011, pages 1082--1090. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. Cooper and A. Frieze. The cover time of the preferential attachment graph. Journal of Combinatorial Theory Series B, 97(2):269--290, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. D. P. Dubhashi and A. Panconesi. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, 2009. Google ScholarGoogle ScholarCross RefCross Ref
  11. P. Erdős and A. Rényi. On Random Graphs I. Publ. Math. Debrecen, 6: 290--297, 1959.Google ScholarGoogle Scholar
  12. J. Goldenberg, B. Libai, and E. Muller. Talk of the network: Complex systems look at the underlying process of word-of-mouth. Marketing Letters 2001, pages 211--223.Google ScholarGoogle Scholar
  13. M. Granovetter. The strength of weak ties: A network theory revisited. Sociological Theory, 1: 201--233, 1983.Google ScholarGoogle ScholarCross RefCross Ref
  14. K. Henderson, B. Gallagher, L. Li, L. Akoglu, T. Eliassi-Rad, H. Tong, and C. Faloutsos. It's who you know: graph mining using recursive structural features. In KDD 2011, pages 663--671. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Kleinberg. Navigation in a small world. Nature, 406(6798):845, 2000.Google ScholarGoogle ScholarCross RefCross Ref
  16. B. Klimmt and Y. Yang. Introducing the enron corpus. In CEAS conference 2004.Google ScholarGoogle Scholar
  17. S. Labitzke, I. Taranu, and H. Hartenstein. What your friends tell others about you: Low cost linkability of social network profiles. In ACM Social Network Mining and Analysis 2011, pages 51--60.Google ScholarGoogle Scholar
  18. S. Lattanzi. Algorithms and models for social networks. PhD thesis, Sapienza, 2011.Google ScholarGoogle Scholar
  19. S. Lattanzi and D. Sivakumar. Affiliation networks. In STOC 2009, pages 427--434. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Malhotra, L. Totti, W. M. Jr., P. Kumaraguru, and V. Almeida. Studying user footprints in different online social networks. In CSOSN 2012, pages 1065--1070. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. Műller and D. Stoyan. Comparison Methods for Stochastic Models and Risks. Wiley, 2002.Google ScholarGoogle Scholar
  22. A. Mislove, B. Viswanath, P. K. Gummadi, and P. Druschel. You are who you know: inferring user profiles in online social networks. In WSDM 2010, pages 251--260. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Narayanan and V. Shmatikov. De-anonymizing social networks. In S&P (Oakland) 2009, pages 111--125. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. J. Novak, P. Raghavan, and A. Tomkins. Anti-aliasing on the web. In WWW 2004, pages 30--39. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. A. Nunes, P. Calado, and B. Martins. Resolving user identities over social networks through supervised learning and rich similarity features. In SAC 2012, pages 728--729. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. P. Pedarsani and M. Grossglauser. On the privacy of anonymized networks. In KDD 2011, pages 1235--1243. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. J. R. Rao and P. Rohatgi. Can pseudonymity really guarantee privacy? In USENIX 2000, pages 85--96. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. M. Rowe and F. Ciravegna. Harnessing the social web: The science of identity disambiguation. In Web Science Conference 2010.Google ScholarGoogle Scholar
  29. G. Schoenebeck. Potential networks, contagious communities, and understanding social network structure. In WWW 2013, pages 1123--1132. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. B. Viswanath, A. Mislove, M. Cha, and K. P. Gummadi. On the evolution of user interaction in facebook. In WOSN 2009, pages 37--42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. L. Yartseva and M. Grossglauser. On the performance of percolation graph matching. In COSN 2013, pages 119--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. H. Yu, M. Kaminsky, P. B. Gibbons, and A. D. Flaxman. Sybilguard: defending against sybil attacks via social networks. IEEE/ACM Trans. Netw. 16(3), pages 267--278. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. R. Zafarani and H. Liu. Connecting corresponding identities across communities. In ICWSM 2009, pages 354--357.Google ScholarGoogle Scholar

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 7, Issue 5
    Janary 2014
    100 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 January 2014
    Published in pvldb Volume 7, Issue 5

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader