skip to main content
10.1145/2588555.2593668acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

The pursuit of a good possible world: extracting representative instances of uncertain graphs

Published:18 June 2014Publication History

ABSTRACT

Data in several applications can be represented as an uncertain graph, whose edges are labeled with a probability of existence. Exact query processing on uncertain graphs is prohibitive for most applications, as it involves evaluation over an exponential number of instantiations. Even approximate processing based on sampling is usually extremely expensive since it requires a vast number of samples to achieve reasonable quality guarantees. To overcome these problems, we propose algorithms for creating deterministic representative instances of uncertain graphs that maintain the underlying graph properties. Specifically, our algorithms aim at preserving the expected vertex degrees because they capture well the graph topology. Conventional processing techniques can then be applied on these instances to closely approximate the result on the uncertain graph. We experimentally demonstrate, with real and synthetic uncertain graphs, that indeed the representative instances can be used to answer, efficiently and accurately, queries based on several properties such as shortest path distance, clustering coefficient and betweenness centrality.

References

  1. S. Abiteboul, P. Kanellakis, and G. Grahne. On the representation and querying of sets of possible worlds. In SIGMOD, pages 34--48, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. E. Adar and C. Re. Managing uncertainty in social networks. IEEE Data Eng. Bull., 30(2):15--22, 2007.Google ScholarGoogle Scholar
  3. C. C. Aggarwal and P. S. Yu. A survey of uncertain data algorithms and applications. TKDE, 21(5):609--623, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Asthana, O. D. King, F. D. Gibbons, and F. P. Roth. Predicting protein complex membership using probabilistic network reliability. Genome Res., 14:1170--1175, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  5. J. Blitzstein and P. Diaconis. A sequential importance sampling algorithm for generating random graphs with prescribed degrees. Internet Mathematics, 6(4):489--522, 2011.Google ScholarGoogle ScholarCross RefCross Ref
  6. P. Boldi, F. Bonchi, A. Gionis, and T. Tassa. Injecting uncertainty in graphs for identity obfuscation. PVLDB, 5(11):1376--1387, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. L. Chen and C. Wang. Continuous subgraph pattern search over certain and uncertain graph streams. TKDE, 22(8):1093--1109, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. F. R. K. Chung and L. Lu. The average distance in a random graph with given expected degrees. Internet Mathematics, 1(1):91--113, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  9. N. Dalvi and D. Suciu. Efficient query evaluation on probabilistic databases. In VLDB, pages 864--875, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. P. Erdös and T. Gallai. Graphs with prescribed degrees of vertices (hungarian). Mat. Lapok, 11:264--274, 1960.Google ScholarGoogle Scholar
  11. P. Fraigniaud. Small worlds as navigable augmented networks: model, analysis, and validation. In ESA, pages 2--11, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. M. M. Gonzalez, B. Dalsgaard, and J. M. Olesen. Centrality measures and the importance of generalist species in pollination networks. Ecological Complexity, 7(1):36--43, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  13. S. L. Hakimi. On realizability of a set of integers as degrees of the vertices of a linear graph (i). J. Soc. Indust. Appl. Math., 10(3):496--506, 1962.Google ScholarGoogle ScholarCross RefCross Ref
  14. S. Hougardy. Linear time approximation algorithms for degree constrained subgraph problems. In W. Cook, L. Lovasz, and J. Vygen, editors, Research Trends in Combinatorial Optimization, pages 185--200. Springer, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  15. R. Jin, L. Liu, and C. C. Aggarwal. Discovering highly reliable subgraphs in uncertain graphs. In KDD, pages 992--1000, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. Jin, L. Liu, B. Ding, and H. Wang. Distance-constraint reachability computation in uncertain graphs. PVLDB, 4(9):551--562, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. Kempe, J. M. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD, pages 137--146, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Kleinberg. Complex networks and decentralized search algorithms. In Int. Congr. of Mathematicians (ICM), 2006.Google ScholarGoogle Scholar
  19. G. Kossinets and D. J. Watts. Empirical analysis of an evolving social network. Science, 311(5757):88--90, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  20. J. Leskovec and C. Faloutsos. Sampling from large graphs. In KDD, pages 631--636, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. D. Liben-Nowell and J. Kleinberg. The link prediction problem for social networks. In CIKM, pages 556--559, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. P. Mahadevan, D. Krioukov, K. Fall, and A. Vahdat. Systematic topology analysis and generation using degree correlations. SIGCOMM Comput. Commun. Rev., 36(4):135--146, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. P. Mahadevan, D. Krioukov, M. Fomenkov, X. Dimitropoulos, k. c. claffy, and A. Vahdat. The internet as-level topology: Three data sources and one definitive metric. SIGCOMM Comput. Commun. Rev., 36(1):17--26, Jan. 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. J. Mestre. Greedy in approximation algorithms. In ESA, pages 528--539, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. S. Micali and V. V. Vazirani. An O(√|V|•|E|) algorithm for finding maximum matching in general graphs. In FOCS, pages 17--27, 1980. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. D. Micciancio. The hardness of the closest vector problem with preprocessing. IEEE Trans. on Information Theory, 47(3):1212--1215, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. Mihail and N. K. Vishnoi. On generating graphs with prescribed vertex degrees for complex network modelling. ARACNE, pages 1--11, 2002.Google ScholarGoogle Scholar
  28. M. E. Newman. Spread of epidemic disease on networks. Phys. Rev. E, 66(1), 2002.Google ScholarGoogle ScholarCross RefCross Ref
  29. M. Potamias, F. Bonchi, A. Gionis, and G. Kollios. k-Nearest Neighbors in uncertain graphs. PVLDB, 3(1):997--1008, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. P. Sevon, L. Eronen, P. Hintsanen, K. Kulovesi, and H. Toivonen. Link discovery in graphs derived from biological databases. In DILS, pages 35--49, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. A. Tizghadam and A. Leon-Garcia. Betweenness centrality and resistance distance in communication networks. IEEE Network, 24(6):10--16, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. L. G. Valiant. The complexity of enumeration and reliability problems. SIAM J. on Computing, 8(3):410--421, 1979.Google ScholarGoogle ScholarCross RefCross Ref
  33. D. J. Watts. Collective dynamics of 'small-world' networks. Nature, 393:440--442, 1998.Google ScholarGoogle ScholarCross RefCross Ref
  34. Y. Yuan, L. Chen, and G. Wang. Efficiently answering probability threshold-based shortest path queries over uncertain graphs. In DASFAA, pages 155--170, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Y. Yuan, G. Wang, L. Chen, and H. Wang. Efficient subgraph similarity search on large probabilistic graph databases. PVLDB, 5(9):800--811, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Y. Yuan, G. Wang, H. Wang, and L. Chen. Efficient subgraph search over large uncertain graphs. PVLDB, 4(11):876--886, 2011.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Z. Zou, J. Li, H. Gao, and S. Zhang. Finding top-k maximal cliques in an uncertain graph. In ICDE, pages 649--652, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  38. Z. Zou, J. Li, H. Gao, and S. Zhang. Mining frequent subgraph patterns from uncertain graph data. TKDE, 22(9):1203--1218, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. The pursuit of a good possible world: extracting representative instances of uncertain graphs

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
          June 2014
          1645 pages
          ISBN:9781450323765
          DOI:10.1145/2588555

          Copyright © 2014 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 18 June 2014

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          SIGMOD '14 Paper Acceptance Rate107of421submissions,25%Overall Acceptance Rate785of4,003submissions,20%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader