ABSTRACT
Our world today is generating huge amounts of graph data such as social networks, biological networks, and the semantic web. Many of these real-world graphs are edge-labeled graphs, i.e., each edge has a label that denotes the relationship between the two vertices connected by the edge. A fundamental research problem on these labeled graphs is how to handle the label-constraint reachability query: Can vertex u reach vertex v through a path whose edge labels are constrained by a set of labels? In this work, we introduce a novel tree-based index framework which utilizes the directed maximal weighted spanning tree algorithm and sampling techniques to maximally compress the generalized transitive closure for the labeled graphs. An extensive experimental evaluation on both real and synthetic datasets demonstrates the efficiency of our approach in answering label-constraint reachability queries.
- Serge Abiteboul and Victor Vianu. Regular path queries with constraints. In PODS, pages 122--133, 1997. Google ScholarDigital Library
- R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient management of transitive relationships in large data and knowledge bases. In SIGMOD, pages 253--262, 1989. Google ScholarDigital Library
- Ian Anderson. Combinatorics of Finite Sets. Clarendon Press, Oxford, 1987.Google Scholar
- A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.Google ScholarCross Ref
- M.de Berg, M.van Kreveld, M.Overmars, and O.Schwarzkopf. Computational Geometry. Springer, 2000.Google Scholar
- Deepayan Chakrabarti, Yiping Zhan, and Christos Faloutsos. R-mat: A recursive model for graph mining. In Fourth SIAM International Conference on Data Mining, 2004.Google ScholarCross Ref
- Y. J. Chu and T. H. Liu. On the shortest arborescence of a directed graph. Science Sinica, 14:1396--1400, 1965.Google Scholar
- Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. McGraw Hill, 1990. Google ScholarDigital Library
- J. Edmonds. Optimum branchings. J. Research of the National Bureau of Standards, 71B:233--240, 1967.Google ScholarCross Ref
- Gang Gou and Rada Chirkova. Efficiently querying large xml data repositories: A survey. IEEE Trans. Knowl. Data Eng., 19(10):1381--1403, 2007. Google ScholarDigital Library
- V. Heidrich-Meisner and C. Igel. Hoeffding and bernstein races for selecting policies in evolutionary direct policy search. In ICML '09. Google ScholarDigital Library
- W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13--30, March 1963.Google ScholarCross Ref
- Thorsson, V., Ranish, J. A., Christmas, R., Buhler, J., Eng J. K., Bumgarner, R., Goodlett, D. R., Aebersold, R., Hood, L., Ideker, T. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. In Science, pages 929--934, 2001.Google Scholar
- R. Jin, H. Hong, H. Wang, N. Ruan, and Y. Xiang. Computing label-constraint reachability in graph databases. Technical Report TR-KSU-CS-2010-1, Computer Science, Kent State University, March 2010.Google Scholar
- R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. SIGMOD, 2009. Google ScholarDigital Library
- R. Jin, Y. Xiang, N. Ruan, and H. Wang. Efficiently answering reachability queries on very large directed graphs. In SIGMOD Conference, pages 595--608, 2008. Google ScholarDigital Library
- Richard Johnsonbaugh and Martin Kalin. A graph generation software package. In SIGCSE, pages 151--154, 1991. Google ScholarDigital Library
- Rinaldi, N. J., Rober,t F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. R., Thompson, C. M., Simon, I., Zeitlinger, J., Jennings, E. G., Murray, H. L., Gordon, D. B., Ren, B., Wyrick, J. J., Tagne, J. Volkert, T. L., Fraenkel, E., Gifford, D. K., Lee, T. I., and R. A. Young. Transcriptional regulatory networks in saccharomyces cerevisiae. In Science, pages 799--804, 2002.Google Scholar
- Jure Leskovec, Ajit Singh, and Jon M. Kleinberg. Patterns of influence in a recommendation network. In PAKDD '06. Google ScholarDigital Library
- Alberto O. Mendelzon and Peter T. Wood. Finding regular simple paths in graph databases. SIAM J. Comput., 24(6), 1995. Google ScholarDigital Library
- Tova Milo and Dan Suciu. Index structures for path expressions. In ICDT, pages 277--295, 1999. Google ScholarDigital Library
- Volodymyr Mnih, Csaba Szepesvári, and Jean-Yves Audibert. Empirical bernstein stopping. In ICML '08. Google ScholarDigital Library
- M. E. J. Newman. Power laws, pareto distributions and zipf's law. Contemporary Physics, 46:323, 2005.Google ScholarCross Ref
- Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: a core of semantic knowledge. In WWW, pages 697--706, 2007. Google ScholarDigital Library
- Haixun Wang, Hao He, Jun Yang, Philip S. Yu, and Jeffrey Xu Yu. Dual labeling: Answering graph reachability queries in constant time. In ICDE '06. Google ScholarDigital Library
Index Terms
- Computing label-constraint reachability in graph databases
Recommendations
Graph Indexing for Efficient Evaluation of Label-constrained Reachability Queries
Best of PODS 2019 and Regular PapersGiven a directed edge labeled graph G, to check whether vertex v is reachable from vertex u under a label set S is to know if there is a path from u to v whose edge labels across the path are a subset of S. Such a query is referred to as a label-...
Efficiently answering reachability queries on very large directed graphs
SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of dataEfficiently processing queries against very large graphs is an important research topic largely driven by emerging real world applications, as diverse as XML databases, GIS, web mining, social network analysis, ontologies, and bioinformatics. In ...
A hierarchical model for label constraint reachability computation
Reachability query is a fundamental problem on networks currently emerging in various application domains. However, very little existing work has studied reachability with label constraints imposed on the edges/vertices of graphs. In this paper, we ...
Comments