skip to main content
10.1145/1772690.1772778acmotherconferencesArticle/Chapter ViewAbstractPublication PageswwwConference Proceedingsconference-collections
research-article

Measurement-calibrated graph models for social network experiments

Authors Info & Claims
Published:26 April 2010Publication History

ABSTRACT

Access to realistic, complex graph datasets is critical to research on social networking systems and applications. Simulations on graph data provide critical evaluation of new systems and applications ranging from community detection to spam filtering and social web search. Due to the high time and resource costs of gathering real graph datasets through direct measurements, researchers are anonymizing and sharing a small number of valuable datasets with the community. However, performing experiments using shared real datasets faces three key disadvantages: concerns that graphs can be de-anonymized to reveal private information, increasing costs of distributing large datasets, and that a small number of available social graphs limits the statistical confidence in the results.

The use of measurement-calibrated graph models is an attractive alternative to sharing datasets. Researchers can "fit" a graph model to a real social graph, extract a set of model parameters, and use them to generate multiple synthetic graphs statistically similar to the original graph. While numerous graph models have been proposed, it is unclear if they can produce synthetic graphs that accurately match the properties of the original graphs. In this paper, we explore the feasibility of measurement-calibrated synthetic graphs using six popular graph models and a variety of real social graphs gathered from the Facebook social network ranging from 30,000 to 3 million edges. We find that two models consistently produce synthetic graphs with common graph metric values similar to those of the original graphs. However, only one produces high fidelity results in our application-level benchmarks. While this shows that graph models can produce realistic synthetic graphs, it also highlights the fact that current graph metrics remain incomplete, and some applications expose graph properties that do not map to existing metrics.

References

  1. Monte Carlo Statistical Methods. New York: Springer-Verlag, 2004.Google ScholarGoogle Scholar
  2. ADAMIC, L. A., BUYUKKOKTEN, O., AND ADAR, E. A social network caught in the web. First Monday 8, 6 (2003).Google ScholarGoogle ScholarCross RefCross Ref
  3. AHN, Y.-Y., ET AL. Analysis of topological characteristics of huge online social networking services. In Proc. of WWW (May 2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. BACKSTROM, L., DWORK, C., AND KLEINBERG, J. Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In WWW (May 2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. BACKSTROM, L., DWORK, C., AND KLEINBERG, J. Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In WWW (May 2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. BARAKAT, C., ET AL. A flow-based model for internet backbone traffic. In Proc. of Internet Measurement Workshop (2002). Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. BARBARO, M., AND ZELLER, T. A face is exposed for AOL searcher no. 4417749, August 2006. NY Times.Google ScholarGoogle Scholar
  8. BLUM, A., CHAN, T.-H. H., AND RWEBANGIRA, M. R. A random-surfer web-graph model. In Proc. of ANALCO (2006).Google ScholarGoogle ScholarCross RefCross Ref
  9. CLAUSET, A., SHALIZI, C. R., AND NEWMAN,M. E. J. Power-law distributions in empirical data. SIAM Review 51, 4 (2009), 661--703. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. DOUCEUR, J. R. The Sybil attack. In Proc. of IPTPS (March 2002). Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. ERDOS, P., AND RENYI, A. On the evolution of random graphs. Mathematical Institute of the Hungarian Acadamy of Science (1960).Google ScholarGoogle Scholar
  12. FLOYD, S., AND KOHLER, E. Internet research needs better models. In Proc. of HotNets (Oct. 2002).Google ScholarGoogle Scholar
  13. GARRISS, S., ET AL. Re: Reliable email. In Proc. of NSDI (2006). Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. IVONA, B., ADAM, K., AND RAHUL, S. Graph model selection using maximum likelihood. In Proc. of ICML (2006). Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. KIM, M., KOTZ, D., AND KIM, S. Extracting a mobility model from real user traces. In Proc. of INFOCOM (April 2006).Google ScholarGoogle ScholarCross RefCross Ref
  16. KONRAD, A., ZHAO, B. Y., JOSEPH, A. D., AND LUDWIG, R. A markov-based channel model algorithm for wireless networks. ACM Wireless Networks 9, 3 (May 2003), 189--199. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. KUMAR, R., ET AL. Stochastic models for the web graph. In Proc. of FOCS (2000). Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. KUMAR, R., NOVAK, J., AND TOMKINS, A. Structure and evolution of online social networks. In Proc. of ACM KDD (2006). Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. LESKOVEC, J., AND FALOUTSOS, C. Scalable modeling of real graphs using kronecker multiplication. In Proc. of ICML (2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. LESKOVEC, J., KLEINBERG, J., AND FALOUTSOS, C. Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proc. of ACM KDD (2005). Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. LI, L., ET AL. Towards a theory of scale-free graphs: Definition, properties, and implications. Internet Math 2, 4 (2005), 431--523.Google ScholarGoogle ScholarCross RefCross Ref
  22. LUDWIG, R., KONRAD, A., AND JOSEPH, A. Optimizing the end-to-end performance of reliable flows over wireless link. In Proc. of MobiCom (Seattle, WA, 1999). Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. MAHADEVAN, P., ET AL. Systematic topology analysis and generation using degree correlations. In Proc. of SIGCOMM (2006). Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. MESSMER, B., AND BUNKE, H. A new algorithm for error-tolerant subgraph isomorphism detection. IEEE Trans. Pattern Analysis and Machine Intelligence 20 (1998), 493--504. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. MISLOVE, A., ET AL. Measurement and analysis of online social networks. In Proc. of IMC (San Diego, CA, Oct 2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. MISLOVE, A., ET AL. Growth of the flickr social network. In Proc. of WOSN (Seattle, WA, August 2008). Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. NARAYANAN, A., AND SHMATIKOV, V. How to break anonymity of the netflix prize dataset. In Proc. of IEEE S&P (May 2008).Google ScholarGoogle Scholar
  28. NARAYANAN, A., AND SHMATIKOV, V. De-anonymizing social networks. In Proc. of IEEE S&P (May 2009). Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. NEWMAN, M. E. J. Mixing patterns in networks. Physical Review E 67-026126 (2003).Google ScholarGoogle Scholar
  30. PUTTASWAMY, K. P. N., SALA, A., AND ZHAO, B. Y. Improving anonymity using social links. In Proc. of NPSec (October 2008).Google ScholarGoogle ScholarCross RefCross Ref
  31. RUSSELL, S., AND NORVIG, P. Artificial Intelligence: A Modern Approach, second ed. Prentice Hall, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. TOIVONEN, R., ET AL. A model for social networks. Physica A: Statistical and Theoretical Physics 371, 2 (Nov. 2006), 851--860.Google ScholarGoogle ScholarCross RefCross Ref
  33. VAZQUEZ, A. Growing network with local rules: Preferential attachment, clustering hierarchy, and degree correlations. Physical Review E 67-056104 (2003).Google ScholarGoogle Scholar
  34. WATTS, D. J., AND STROGATZ, S. Collective dynamics of 'small-world' networks. Nature, 393 (1998), 440--442.Google ScholarGoogle ScholarCross RefCross Ref
  35. WILSON, C., BOE, B., SALA, A., PUTTASWAMY, K. P. N., AND ZHAO, B. Y. User interactions in social networks and their implications. In Proc. of EuroSys (April 2009). Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. YU, H., ET AL. Sybilguard: defending against sybil attacks via social networks. In Proc. of SIGCOMM (September 2006). Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Measurement-calibrated graph models for social network experiments

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Other conferences
          WWW '10: Proceedings of the 19th international conference on World wide web
          April 2010
          1407 pages
          ISBN:9781605587998
          DOI:10.1145/1772690

          Copyright © 2010 International World Wide Web Conference Committee (IW3C2)

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 26 April 2010

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate1,899of8,196submissions,23%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        ePub

        View this article in ePub.

        View ePub