skip to main content
10.1145/1835804.1835933acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Inferring networks of diffusion and influence

Published:25 July 2010Publication History

ABSTRACT

Information diffusion and virus propagation are fundamental processes talking place in networks. While it is often possible to directly observe when nodes become infected, observing individual transmissions (i.e., who infects whom or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and in practice gives provably near-optimal performance. We demonstrate the effectiveness of our approach by tracing information cascades in a set of 170 million blogs and news articles over a one year period to infer how information flows through the online media space. We find that the diffusion network of news tends to have a core-periphery structure with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence with more general news media sites acting as connectors between them.

Skip Supplemental Material Section

Supplemental Material

kdd2010_gomez_ind_01.mov

mov

100.4 MB

References

  1. Supporting website. http://snap.stanford.edu/netinf.Google ScholarGoogle Scholar
  2. . Adar and L. A. Adamic. Tracking information epidemics in blogspace. In Web Intelligence, pages 207--214, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. .-L. Barabási. The origin of bursts and heavy tails in human dynamics. Nature, 435:207, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  4. .-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509--512, 1999.Google ScholarGoogle ScholarCross RefCross Ref
  5. . Clauset, C. Moore, and M. E. J. Newman. Hierarchical structure and the prediction of missing links in networks. Nature, 453(7191):98--101, 2008.Google ScholarGoogle ScholarCross RefCross Ref
  6. . Crane and D. Sornette. Robust dynamic classes revealed by measuring the response function of a social system. Proc. of the National Academy of Sciences, 105(41):15649--15653, October 2008.Google ScholarGoogle ScholarCross RefCross Ref
  7. . Edmonds. Optimum branchings. Journal of Resesearch of the National Bureau of Standards, (71B):233--240, 1967.Google ScholarGoogle Scholar
  8. . Erd\Hos and A. Rényi. On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Science, 5:17--67, 1960.Google ScholarGoogle Scholar
  9. . Friedman and D. Koller. Being Bayesian about network structure. A Bayesian approach to structure discovery in Bayesian networks. Machine Learning, 50(1):95--125, 2003.Google ScholarGoogle Scholar
  10. . Friedman, I. Nachman, and D. Pe'er. Learning Bayesian network structure from massive datasets: The "Sparse Candidate" algorithm. In Proc. UAI, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. . Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins. Information diffusion through blogspace. In WWW '04: Proc. of the 13th international conference on World Wide Web, pages 491--501, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. . Katz and P. Lazarsfeld. Personal influence: The part played by people in the flow of mass communications. Free Press, 1955.Google ScholarGoogle Scholar
  13. . Kempe, J. M. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In KDD '03: Proc. of the 9th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 137--146, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. . Khuller, A. Moss, and J. Naor. The budgeted maximum coverage problem. Information Processing Letters, 70(1):39--45, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. . Knuth. The art of computer programming. Addison-Wesley, 1968.Google ScholarGoogle Scholar
  16. . Kumar, J. Novak, P. Raghavan, and A. Tomkins. Structure and evolution of blogspace. CACM, 47(12):35--39, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. . Leskovec, L. A. Adamic, and B. A. Huberman. The dynamics of viral marketing. In EC '06: Proc. of the 7th ACM conference on Electronic commerce, pages 228--237, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. . Leskovec, L. Backstrom, and J. Kleinberg. Meme-tracking and the dynamics of the news cycle. In KDD '09: Proc. of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 497--506, New York, NY, USA, 2009. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. . Leskovec and C. Faloutsos. Scalable modeling of real graphs using kronecker multiplication. In ICML '07: Proc. of the 24th International Conference on Machine Learning, page 504, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. . Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In KDD '05: Proc. of the 11th ACM SIGKDD international conference on Knowledge discovery in data mining, page 187, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. . Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In KDD '07: Proc. of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 420--429, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. . Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Statistical properties of community structure in large social and information networks. In WWW '08: Proc. of the 17th International Conference on World Wide Web, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. . Leskovec, M. McGlohon, C. Faloutsos, N. Glance, and M. Hurst. Cascading behavior in large blog graphs. In SDM '07: Proc. of the SIAM Conference on Data Mining, 2007.Google ScholarGoogle Scholar
  24. . Leskovec, A. Singh, and J. M. Kleinberg. Patterns of influence in a recommendation network. In PAKDD '06: Proc. of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 380--389, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. . Liben-Nowell and J. Kleinberg. Tracing the flow of information on a global scale using Internet chain-letter data. Proc. of the National Academy of Sciences, 105(12):4633--4638, 25 Mar. 2008.Google ScholarGoogle ScholarCross RefCross Ref
  26. . D. Malmgren, D. B. Stouffer, A. E. Motter, and L. A. A. N. Amaral. A poissonian explanation for heavy tails in e-mail communication. Proc. of the National Academy of Sciences, 105(47):18153--18158, November 2008.Google ScholarGoogle ScholarCross RefCross Ref
  27. . Nemhauser, L. Wolsey, and M. Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical Programming, 14(1):265--294, 1978.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. . M. Rogers. Diffusion of Innovations. Free Press, New York, fourth edition, 1995.Google ScholarGoogle Scholar
  29. . Wallinga and P. Teunis. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology, 160(6):509--516, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  30. . J. Watts and P. S. Dodds. Influentials, networks, and public opinion formation. Journal of Consumer Research, 34(4):441--458, December 2007.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Inferring networks of diffusion and influence

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
      July 2010
      1240 pages
      ISBN:9781450300551
      DOI:10.1145/1835804

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 25 July 2010

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader