skip to main content
10.1145/1807167.1807183acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Computing label-constraint reachability in graph databases

Published:06 June 2010Publication History

ABSTRACT

Our world today is generating huge amounts of graph data such as social networks, biological networks, and the semantic web. Many of these real-world graphs are edge-labeled graphs, i.e., each edge has a label that denotes the relationship between the two vertices connected by the edge. A fundamental research problem on these labeled graphs is how to handle the label-constraint reachability query: Can vertex u reach vertex v through a path whose edge labels are constrained by a set of labels? In this work, we introduce a novel tree-based index framework which utilizes the directed maximal weighted spanning tree algorithm and sampling techniques to maximally compress the generalized transitive closure for the labeled graphs. An extensive experimental evaluation on both real and synthetic datasets demonstrates the efficiency of our approach in answering label-constraint reachability queries.

References

  1. Serge Abiteboul and Victor Vianu. Regular path queries with constraints. In PODS, pages 122--133, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient management of transitive relationships in large data and knowledge bases. In SIGMOD, pages 253--262, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Ian Anderson. Combinatorics of Finite Sets. Clarendon Press, Oxford, 1987.Google ScholarGoogle Scholar
  4. A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.Google ScholarGoogle ScholarCross RefCross Ref
  5. M.de Berg, M.van Kreveld, M.Overmars, and O.Schwarzkopf. Computational Geometry. Springer, 2000.Google ScholarGoogle Scholar
  6. Deepayan Chakrabarti, Yiping Zhan, and Christos Faloutsos. R-mat: A recursive model for graph mining. In Fourth SIAM International Conference on Data Mining, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  7. Y. J. Chu and T. H. Liu. On the shortest arborescence of a directed graph. Science Sinica, 14:1396--1400, 1965.Google ScholarGoogle Scholar
  8. Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. McGraw Hill, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Edmonds. Optimum branchings. J. Research of the National Bureau of Standards, 71B:233--240, 1967.Google ScholarGoogle ScholarCross RefCross Ref
  10. Gang Gou and Rada Chirkova. Efficiently querying large xml data repositories: A survey. IEEE Trans. Knowl. Data Eng., 19(10):1381--1403, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. V. Heidrich-Meisner and C. Igel. Hoeffding and bernstein races for selecting policies in evolutionary direct policy search. In ICML '09. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13--30, March 1963.Google ScholarGoogle ScholarCross RefCross Ref
  13. Thorsson, V., Ranish, J. A., Christmas, R., Buhler, J., Eng J. K., Bumgarner, R., Goodlett, D. R., Aebersold, R., Hood, L., Ideker, T. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. In Science, pages 929--934, 2001.Google ScholarGoogle Scholar
  14. R. Jin, H. Hong, H. Wang, N. Ruan, and Y. Xiang. Computing label-constraint reachability in graph databases. Technical Report TR-KSU-CS-2010-1, Computer Science, Kent State University, March 2010.Google ScholarGoogle Scholar
  15. R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. SIGMOD, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. Jin, Y. Xiang, N. Ruan, and H. Wang. Efficiently answering reachability queries on very large directed graphs. In SIGMOD Conference, pages 595--608, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Richard Johnsonbaugh and Martin Kalin. A graph generation software package. In SIGCSE, pages 151--154, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Rinaldi, N. J., Rober,t F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. R., Thompson, C. M., Simon, I., Zeitlinger, J., Jennings, E. G., Murray, H. L., Gordon, D. B., Ren, B., Wyrick, J. J., Tagne, J. Volkert, T. L., Fraenkel, E., Gifford, D. K., Lee, T. I., and R. A. Young. Transcriptional regulatory networks in saccharomyces cerevisiae. In Science, pages 799--804, 2002.Google ScholarGoogle Scholar
  19. Jure Leskovec, Ajit Singh, and Jon M. Kleinberg. Patterns of influence in a recommendation network. In PAKDD '06. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Alberto O. Mendelzon and Peter T. Wood. Finding regular simple paths in graph databases. SIAM J. Comput., 24(6), 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Tova Milo and Dan Suciu. Index structures for path expressions. In ICDT, pages 277--295, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Volodymyr Mnih, Csaba Szepesvári, and Jean-Yves Audibert. Empirical bernstein stopping. In ICML '08. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. E. J. Newman. Power laws, pareto distributions and zipf's law. Contemporary Physics, 46:323, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  24. Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: a core of semantic knowledge. In WWW, pages 697--706, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Haixun Wang, Hao He, Jun Yang, Philip S. Yu, and Jeffrey Xu Yu. Dual labeling: Answering graph reachability queries in constant time. In ICDE '06. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Computing label-constraint reachability in graph databases

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
      June 2010
      1286 pages
      ISBN:9781450300322
      DOI:10.1145/1807167

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 6 June 2010

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader