Abstract
Driven by the emerging network applications, querying and mining uncertain graphs has become increasingly important. In this paper, we investigate a fundamental problem concerning uncertain graphs, which we call the distance-constraint reachability (DCR) problem: Given two vertices s and t, what is the probability that the distance from s to t is less than or equal to a user-defined threshold d in the uncertain graph? Since this problem is #P-Complete, we focus on efficiently and accurately approximating DCR online. Our main results include two new estimators for the probabilistic reachability. One is a Horvitz-Thomson type estimator based on the unequal probabilistic sampling scheme, and the other is a novel recursive sampling estimator, which effectively combines a deterministic recursive computational procedure with a sampling process to boost the estimation accuracy. Both estimators can produce much smaller variance than the direct sampling estimator, which considers each trial to be either 1 or 0. We also present methods to make these estimators more computationally efficient. The comprehensive experiment evaluation on both real and synthetic datasets demonstrates the efficiency and accuracy of our new estimators.
- C. C. Aggarwal, editor. Managing and Mining Uncertain Data. Advances in Database Systems. Springer, 2009. Google ScholarDigital Library
- M. O. Ball. Computational complexity of network reliability analysis: An overview. IEEE Transactions on Reliability, 35:230--239, 1986.Google ScholarCross Ref
- A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.Google ScholarCross Ref
- I. Clarke, O. Sandberg, B. Wiley, and T. W. Hong. Freenet: a distributed anonymous information storage and retrieval system. In International workshop on Designing privacy enhancing technologies: design issues in anonymity and unobservability, pages 46--66, 2001. Google ScholarDigital Library
- C. J. Colbourn. The Combinatorics of Network Reliability. Oxford University Press, Inc., 1987. Google ScholarDigital Library
- G. S. Fishman. A comparison of four monte carlo methods for estimating the probability of s-t connectedness. IEEE Transactions on Reliability, 35(2):145--155, 1986.Google ScholarCross Ref
- M. Hua and J. Pei. Probabilistic path queries in road networks: traffic uncertainty aware path selection. In EDBT, pages 347--358, 2010. Google ScholarDigital Library
- R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD Conference, pages 813--826, 2009. Google ScholarDigital Library
- R. M. Karp and M. G. Luby. A new monte-carlo method for estimating the failure probability of an n-component system. Technical report, Berkeley, CA, USA, 1983. Google ScholarDigital Library
- H. Kumamoto, K. Tanaka, I. Koichi, and H. E. J. Dagger-sampling monte carlo for system unavailability evaluation. IEEE Transactions on Reliability, 29(2):122--125.Google ScholarCross Ref
- G. Pandurangan, P. Raghavan, and E. Upfal. Building low-diameter p2p networks. In FOCS, pages 492--499, 2001. Google ScholarDigital Library
- L. Petingi. Combinatorial and computational properties of a diameter constrained network reliability model. In Proceedings of the WSEAS International Conference on Applied Computing Conference, ACC'08, 2008. Google ScholarDigital Library
- M. Potamias, F. Bonchi, A. Gionis, and G. Kollios. k-nearest neighbors in uncertain graphs. PVLDB, 3(1):997--1008, 2010. Google ScholarDigital Library
- G. Rubino. Network reliability evaluation, pages 275--302. Gordon and Breach Science Publishers, Inc., Newark, NJ, USA, 1999. Google ScholarDigital Library
- S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Pearson Education, 2003. Google ScholarDigital Library
- A. Satyanarayana and R. K. Wood. A linear-time algorithm for computing k-terminal reliability in series-parallel networks. SIAM Journal on Computing, 14(4):818--832, 1985.Google ScholarCross Ref
- G. Swamynathan, C. Wilson, B. Boe, K. Almeroth, and B. Y. Zhao. Do social networks improve e-commerce?: a study on social marketplaces. In WOSP '08. Google ScholarDigital Library
- S. Thompson. Sampling, Second Edition. New York: Wiley., 2002.Google Scholar
- http://mips.helmholtz-muenchen.de/genre/proj/mpact/.Google Scholar
- http://thebiogrid.org/.Google Scholar
- L. G. Valiant. The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8(3):410--421, 1979.Google ScholarDigital Library
- H. Yildirim, V. Chaoji, and M. J. Zaki. Grail: Scalable reachability index for large graphs. PVLDB, 3(1):276--284, 2010. Google ScholarDigital Library
- Y. Yuan, L. Chen, and G. Wang. Efficiently answering probability threshold-based shortest path queries over uncertain graphs. In DASFAA (1) '10, pages 155--170, 2010. Google ScholarDigital Library
- Z. Zou, H. Gao, and J. Li. Discovering frequent subgraphs over uncertain graph databases under probabilistic semantics. KDD '10, pages 633--642, 2010. Google ScholarDigital Library
- Z. Zou, J. Li, H. Gao, and S. Zhang. Finding top-k maximal cliques in an uncertain graph. In ICDE, pages 649--652, 2010.Google ScholarCross Ref
Index Terms
- Distance-constraint reachability computation in uncertain graphs
Recommendations
On the connectivity of bipartite distance-balanced graphs
A connected graph @C is said to be distance-balanced whenever for any pair of adjacent vertices u,v of @C the number of vertices closer to u than to v is equal to the number of vertices closer to v than to u. In [K. Handa, Bipartite graphs with balanced ...
A note on connected dominating sets of distance-hereditary graphs
A vertex subset of a graph is a dominating set if every vertex of the graph belongs to the set or has a neighbor in it. A connected dominating set is a dominating set such that the induced subgraph of the set is a connected graph. A graph is called ...
Colouring exact distance graphs of chordal graphs
AbstractFor a graph G = ( V , E ) and positive integer p, the exact distance- p graph G [ ♮ p ] is the graph with vertex set V and with an edge between vertices x and y if and only if x and y have distance p. ...
Comments