skip to main content
10.1145/2020408.2020569acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
poster

Discovering highly reliable subgraphs in uncertain graphs

Published:21 August 2011Publication History

ABSTRACT

In this paper, we investigate the highly reliable subgraph problem, which arises in the context of uncertain graphs. This problem attempts to identify all induced subgraphs for which the probability of connectivity being maintained under uncertainty is higher than a given threshold. This problem arises in a wide range of network applications, such as protein-complex discovery, network routing, and social network analysis. Since exact discovery may be computationally intractable, we introduce a novel sampling scheme which enables approximate discovery of highly reliable subgraphs with high probability. Furthermore, we transform the core mining task into a new frequent cohesive set problem in deterministic graphs. Such transformation enables the development of an efficient two-stage approach which combines novel peeling techniques for maximal set discovery with depth-first search for further enumeration. We demonstrate the effectiveness and efficiency of the proposed algorithms on real and synthetic data sets.

References

  1. E. Adar and C. Re. Managing uncertainty in social networks. IEEE Data Eng. Bull., 30(2):15--22, 2007.Google ScholarGoogle Scholar
  2. C. C. Aggarwal, editor. Managing and Mining Uncertain Data. Advances in Database Systems. Springer, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Asthana, O. D. King, F. D. Gibbons, and F. P. Roth. Predicting protein complex membership using probabilistic network reliability. Genome Res, 14(6):1170--1175, June 2004.Google ScholarGoogle ScholarCross RefCross Ref
  4. J. S. Bader, A. Chaudhuri, J. M. Rothberg, and J. Chant. Gaining confidence in high-throughput protein interaction networks. Nature Biotechnology, 22(1):78--85, December 2003.Google ScholarGoogle ScholarCross RefCross Ref
  5. M. O. Ball. Computational complexity of network reliability analysis: An overview. IEEE Transactions on Reliability, 35:230--239, 1986.Google ScholarGoogle ScholarCross RefCross Ref
  6. Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1):289--300, 1995.Google ScholarGoogle ScholarCross RefCross Ref
  7. Y. Benjamini and D. Yekutieli. The Control of the False Discovery Rate in Multiple Testing under Dependency. The Annals of Statistics, 29(4):1165--1188, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  8. C. Chen, X. Yan, F. Zhu, and J. Han. gapprox: Mining frequent approximate patterns from a massive network. In ICDM, pages 445--450, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. The Annals of Mathematical Statistics, 23(4):493--507, 1952.Google ScholarGoogle ScholarCross RefCross Ref
  10. C. J. Colbourn. The Combinatorics of Network Reliability. Oxford University Press, Inc., 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J. Ghosh, H. Q. Ngo, S. Yoon, and C. Qiao. On a Routing Problem Within Probabilistic Graphs and its Application to Intermittently Connected Networks. In INFOCOM'07, pages 1721--1729, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. R. Guha, R. Kumar, P. Raghavan, and A. Tomkins. Propagation of trust and distrust. In WWW'04, pages 403--412, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. J. Han, H. Cheng, D. Xin, and X. Yan. Frequent Pattern Mining: Current Status and Future Directions. Data Mining and Knowledge Discovery, 14(1), 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Han, M. Kamber, and J. Pei. Data Mining: Concepts and Techniques, Second Edition (The Morgan Kaufmann Series in Data Management Systems). Morgan Kaufmann, 2nd edition, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. S. Hanhijarvi, K. Puolamaki, and G. C. Garriga. Multiple hypothesis testing in pattern discovery, 2009. arXiv:0906.5263v1 {stat.ML}.Google ScholarGoogle Scholar
  16. P. Hintsanen. The most reliable subgraph problem. In PKDD, pages 471--478, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. P. Hintsanen and H. Toivonen. Finding reliable subgraphs from large probabilistic graphs. Data Min. Knowl. Discov., 17(1):3--23, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Hintsanen, H. Toivonen, and P. Sevon. Fast discovery of reliable subnetworks. In ASONAM, pages 104--111, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13--30, 1963.Google ScholarGoogle ScholarCross RefCross Ref
  20. J. Hopcroft and R. Tarjan. Algorithm 447: efficient algorithms for graph manipulation. Commun. ACM, 16:372--378, June 1973. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. R. Jiang, Z. Tu, T. Chen, and F. Sun. Network motif identification in stochastic networks. PNAS, 103(25):9404--9409, June 2006.Google ScholarGoogle ScholarCross RefCross Ref
  22. R. Jin, L. Liu, B. Ding, and H. Wang. Distance-constraint reachability computation in uncertain graphs. In Proceedings of the VLDB Endowment, volume 4, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. R. Jin, S. McCallen, and E. Almaas. Trend motif: A graph mining approach for analysis of dynamic complex networks. In ICDM, pages 541--546, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. B. Karrer and M. E. J. Newman. Stochastic blockmodels and community structure in networks. Phys. Rev. E, 83(1):016107, Jan 2011.Google ScholarGoogle ScholarCross RefCross Ref
  25. M. Kasari, H. Toivonen, and P. Hintsanen. Fast discovery of reliable k-terminal subgraphs. In PAKDD (2), pages 168--177, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. D. Kempe, J. M. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD, pages 137--146, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. A. Kirsch, M. Mitzenmacher, A. Pietracaprina, G. Pucci, E. Upfal,and F. Vandin. An efficient rigorous approach for identifying statistically significant frequent itemsets. In PODS'09, pages 117--126, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. V. E. Lee, N. Ruan, R. Jin, and C. Aggarwal. A survey of algorithms for dense subgraph discovery. In Charu C. Aggarwal and Haixun Wang, editors, Managing and Mining Graph Data, pages 303--336. Springer US, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  29. V. Manfredi, R. Hancock, and J. Kurose. Robust routing in dynamic manets. In Annual Conference of the International Technology Alliance (ACITA), 2008.Google ScholarGoogle Scholar
  30. M. Potamias, F. Bonchi, A. Gionis, and G. Kollios. k-nearest neighbors in uncertain graphs. PVLDB, 3(1):997--1008, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. G. Rubino. Network reliability evaluation. In Network performance modeling and simulation, pages 275--302. 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. M. Stoer and F. Wagner. A simple min-cut algorithm. J. ACM, 44:585--591, July 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. G. Swamynathan, C. Wilson, B. Boe, K. Almeroth, and B. Y. Zhao. Do social networks improve e-commerce?: a study on social marketplaces. In WOSP '08: Proceedings of the first workshop on Online social networks, pages 1--6, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. L. G. Valiant. The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8(3):410--421, 1979.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. D. R. White and F. Harary. The cohesiveness of blocks in social networks: Node connectivity and conditional density. Sociological Methodology, 31:305--359, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  36. X. Yan, X. J. Zhou, and J. Han. Mining closed relational graphs with connectivity constraints. In KDD '05, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. D. M. Yellin. An algorithm for dynamic subset and intersection testing. Theoretical Computer Science, 129(2):397--406, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Y. Yuan, L. Chen, and G. Wang. Efficiently answering probability threshold-based shortest path queries over uncertain graphs. In DASFAA, pages 155--170, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Z. Zou, H. Gao, and J. Li. Discovering frequent subgraphs over uncertain graph databases under probabilistic semantics. In KDD, pages 633--642, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Z. Zou, J. Li, H. Gao, and S. Zhang. Finding top-k maximal cliques in an uncertain graph. In ICDE, pages 649--652, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  41. Z. Zou, J. Li, H. Gao, and S. Zhang. Mining frequent subgraph patterns from uncertain graph data. IEEE Trans. on Knowl. and Data Eng., 22(9), 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Discovering highly reliable subgraphs in uncertain graphs

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '11: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2011
      1446 pages
      ISBN:9781450308137
      DOI:10.1145/2020408

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 August 2011

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • poster

      Acceptance Rates

      Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader