skip to main content
10.1145/1281192.1281239acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
Article

Cost-effective outbreak detection in networks

Published:12 August 2007Publication History

ABSTRACT

Given a water distribution network, where should we place sensors toquickly detect contaminants? Or, which blogs should we read to avoid missing important stories?.

These seemingly different problems share common structure: Outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network, in order to detect the spreading of a virus or information asquickly as possible. We present a general methodology for near optimal sensor placement in these and related problems. We demonstrate that many realistic outbreak detection objectives (e.g., detection likelihood, population affected) exhibit the property of "submodularity". We exploit submodularity to develop an efficient algorithm that scales to large problems, achieving near optimal placements, while being 700 times faster than a simple greedy algorithm. We also derive online bounds on the quality of the placements obtained by any algorithm. Our algorithms and bounds also handle cases where nodes (sensor locations, blogs) have different costs.

We evaluate our approach on several large real-world problems,including a model of a water distribution network from the EPA, andreal blog data. The obtained sensor placements are provably near optimal, providing a constant fraction of the optimal solution. We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude. We also show how the approach leads to deeper insights in both applications, answering multicriteria trade-off, cost-sensitivity and generalization questions.

Skip Supplemental Material Section

Supplemental Material

p420-leskovec-200.mov

mov

30.3 MB

p420-leskovec-768.mov

mov

105.8 MB

References

  1. N. Bailey. The Mathematical Theory of Infectious Diseases and its Applications. Griffin, London, 1975.Google ScholarGoogle Scholar
  2. J. Berry, W. E. Hart, C. E. Phillips, J. G. Uber, and J. Watson. Sensor placement in municipal water networks with temporal integer programming models. J. Water Resources Planning and Management, 2006.Google ScholarGoogle Scholar
  3. S. Bikhchandani, D. Hirshleifer, and I. Welch. A theory of fads, fashion, custom, and cultural change as informational cascades. J. of Polit. Econ., (5), 1992.Google ScholarGoogle ScholarCross RefCross Ref
  4. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge UP, March 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. R. Cohen, S. Havlin, and D. ben Avraham. Efficient immunization strategies for computer networks and populations. Physical Review Letters, 91:247901, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  6. G. Dorini, P. Jonkergouw, and et. al. An efficient algorithm for sensor placement in water distribution systems. In at. Dist. Syst. An. Conf., 2006.Google ScholarGoogle Scholar
  7. N. S. Glance, M. Hurst, K. Nigam, M. Siegler, R. Stockton, and T. Tomokiyo. Deriving marketing intelligence from online discussion. In KDD, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Goldenberg, B. Libai, and E. Muller. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters, 12, 2001.Google ScholarGoogle Scholar
  9. D. Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins. Information diffusion through blogspace. WWW'04. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of in?uence through a social network. In KDD, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Khuller, A. Moss, and J. Naor. The budgeted maximum coverage problem. Inf. Proc. Let., 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. Krause, J. Leskovec, C. Guestrin, J. VanBriesen, and C. Faloutsos. Efficient sensor placement optimization for securing large water distribution networks. Submitted to the J. of Water Resources Planning an Management, 2007.Google ScholarGoogle Scholar
  13. R. Kumar, J. Novak, P. Raghavan, and A. Tomkins. On the bursty evolution of blogspace. In WWW, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Leskovec, L. A. Adamic, and B. A. Huberman. The dynamics of viral marketing. In ACM EC, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective Outbreak Detection in Networks. TR, CMU-ML-07-111, 2007.Google ScholarGoogle Scholar
  16. J. Leskovec, M. McGlohon, C. Faloutsos, N. S. Glance, and M. Hurst. Cascading behavior in large blog graphs. In SDM, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  17. G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14, 1978.Google ScholarGoogle Scholar
  18. A. Ostfeld and E. Salomons. Optimal layout of early warning detection stations for water distribution systems security. J. Water Resources Planning and Management, 130(5):377--385, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  19. A. Ostfeld, J. G. Uber, and E. Salomons. Battle of water sensor networks: A design challenge for engineers and algorithms. In WDSA, 2006.Google ScholarGoogle Scholar
  20. R. Pastor-Satorras and A. Vespignani. Immunization of complex networks. Physical Review E, 65, 2002.Google ScholarGoogle Scholar
  21. M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. KDD '02. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. T. G. Robertazzi and S. C. Schwartz. An accelerated sequential algorithm for producing D-optimal designs. SIAM J.Sci.Stat.Comp., 10(2):341--358, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. E. Rogers. Diffusion of innovations. Free Press, 1995.Google ScholarGoogle Scholar
  24. L. A. Rossman. The epanet programmer's toolkit for analysis of water distribution systems. In Ann. Wat. Res. Plan. Mgmt. Conference, 1999.Google ScholarGoogle ScholarCross RefCross Ref
  25. M. Sviridenko. A note on maximizing a submodular set function subject to knapsack constraint. Operations Research Letters, 32:41--43, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Cost-effective outbreak detection in networks

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2007
      1080 pages
      ISBN:9781595936097
      DOI:10.1145/1281192

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 August 2007

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      KDD '07 Paper Acceptance Rate111of573submissions,19%Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader