Abstract
Given a large directed graph, rapidly answering reachability queries between source and target nodes is an important problem. Existing methods for reachability trade-off indexing time and space versus query time performance. However, the biggest limitation of existing methods is that they simply do not scale to very large real-world graphs. We present a very simple, but scalable reachability index, called GRAIL, that is based on the idea of randomized interval labeling, and that can effectively handle very large graphs. Based on an extensive set of experiments, we show that while more sophisticated methods work better on small graphs, GRAIL is the only index that can scale to millions of nodes and edges. GRAIL has linear indexing time and space, and the query time ranges from constant time to being linear in the graph order and size.
- R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient management of transitive relationships in large data and knowledge bases. SIGMOD Rec., 18(2):253--262, 1989. Google ScholarDigital Library
- P. Bouros, S. Skiadopoulos, T. Dalamagas, D. Sacharidis, and T. Sellis. Evaluating reachability queries over path collections. In SSDBM, page 416, 2009. Google ScholarDigital Library
- R. Bramandia, B. Choi, and W. K. Ng. On incremental maintenance of 2-hop labeling of graphs. In WWW, 2008. Google ScholarDigital Library
- Y. Chen. General spanning trees and reachability query evaluation. In Canadian Conference on Computer Science and Software Engineering, Montreal, Quebec, Canada, 2009. Google ScholarDigital Library
- Y. Chen and Y. Chen. An efficient algorithm for answering graph reachability queries. In ICDE, 2008. Google ScholarDigital Library
- J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computing reachability labelings for large graphs with high compression rate. In EBDT, 2008. Google ScholarDigital Library
- E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SIAM Journal of Computing, 32(5):1335--1355, 2003. Google ScholarDigital Library
- T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, 2001. Google ScholarDigital Library
- C. Demetrescu and G. Italiano. Fully Dynamic Transitive Closure: Breaking through the O(n 2) Barrier. In FOCS, 2000. Google ScholarDigital Library
- C. Demetrescu and G. Italiano. Dynamic shortest paths and transitive closure: Algorithmic techniques and data structures. Journal of Discrete Algorithms, 4(3):353--383, 2006.Google ScholarCross Ref
- P. F. Dietz. Maintaining order in a linked list. In STOC, 1982. Google ScholarDigital Library
- H. He, H. Wang, J. Yang, and P. S. Yu. Compact reachability labeling for graph-structured data. In CIKM, 2005. Google ScholarDigital Library
- H. V. Jagadish. A compression technique to materialize transitive closure. ACM Trans. Database Syst., 15(4):558--598, 1990. Google ScholarDigital Library
- R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD, 2009. Google ScholarDigital Library
- R. Jin, Y. Xiang, N. Ruan, and H. Wang. Efficient answering reachability queries on very large directed graphs. In SIGMOD, 2008. Google ScholarDigital Library
- V. King and G. Sagert. A fully dynamic algorithm for maintaining the transitive closure. J. Comput. Syst. Sci., 65(1):150--167, 2002. Google ScholarDigital Library
- I. Krommidas and C. Zaroliagis. An experimental study of algorithms for fully dynamic transitive closure. Journal of Experimental Algorithmics, 12:16, 2008. Google ScholarDigital Library
- L. Roditty and U. Zwick. A fully dynamic reachability algorithm for directed graphs with an almost linear update time. In STOC, 2004. Google ScholarDigital Library
- R. Schenkel, A. Theobald, and G. Weikum. HOPI: an efficient connection index for complex XML document collections. In EBDT, 2004.Google ScholarCross Ref
- R. Schenkel, A. Theobald, and G. Weikum. Efficient creation and incremental maintenance of the hopi index for complex xml document collections. In ICDE, 2005. Google ScholarDigital Library
- S. Trissl and U. Leser. Fast and practical indexing and querying of very large graphs. In SIGMOD, 2007. Google ScholarDigital Library
- H. Wang, H. He, J. Yang, P. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In ICDE, 2006. Google ScholarDigital Library
- J. X. Yu, X. Lin, H. Wang, P. S. Yu, and J. Cheng. Fast computation of reachability labeling for large graphs. In EBDT, 2006.Google Scholar
Index Terms
- GRAIL: scalable reachability index for large graphs
Recommendations
GRAIL: a scalable index for reachability queries in very large graphs
Given a large directed graph, rapidly answering reachability queries between source and target nodes is an important problem. Existing methods for reachability tradeoff indexing time and space versus query time performance. However, the biggest ...
GRAIL: efficient time-series representation learning
The analysis of time series is becoming increasingly prevalent across scientific disciplines and industrial applications. The effectiveness and the scalability of time-series mining techniques critically depend on design choices for three components ...
On the Multichromatic Number of s-Stable Kneser Graphs
For positive integers n and s, a subset Sï [n] is s-stable if sï |i-j|ï n-s for distinct i,j∈S . The s-stable r-uniform Kneser hypergraph KGrn,ks-stable is the r-uniform hypergraph that has the collection of all s-stable k-element subsets of [n] as ...
Comments