skip to main content
10.1145/1376616.1376660acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Graphs-at-a-time: query language and access methods for graph databases

Authors Info & Claims
Published:09 June 2008Publication History

ABSTRACT

With the prevalence of graph data in a variety of domains, there is an increasing need for a language to query and manipulate graphs with heterogeneous attributes and structures. We propose a query language for graph databases that supports arbitrary attributes on nodes, edges, and graphs. In this language, graphs are the basic unit of information and each query manipulates one or more collections of graphs. To allow for flexible compositions of graph structures, we extend the notion of formal languages from strings to the graph domain. We present a graph algebra extended from the relational algebra in which the selection operator is generalized to graph pattern matching and a composition operator is introduced for rewriting matched graphs. Then, we investigate access methods of the selection operator. Pattern matching over large graphs is challenging due to the NP-completeness of subgraph isomorphism. We address this by a combination of techniques: use of neighborhood subgraphs and profiles, joint reduction of the search space, and optimization of the search order. Experimental results on real and synthetic large graphs demonstrate that our graph specific optimizations outperform an SQL-based implementation by orders of magnitude.

References

  1. S. Asthana et al. Predicting protein complex membership using probabilistic network reliability. Genome Research, May 2004.Google ScholarGoogle ScholarCross RefCross Ref
  2. S. Berretti, A. D. Bimbo, and E. Vicario. Efficient matching and indexing of graph models in content-based retrieval. In IEEE Trans. on Pattern Analysis and Machine Intelligence, volume 23, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Boag, D. Chamberlin, M. F. Fernández, D. Florescu, J. Robie, and J. Siméon. XQuery 1.0: An XML query language. W3C, http://www.w3.org/TR/xquery/, 2007.Google ScholarGoogle Scholar
  4. C. Branden and J. Tooze. Introduction to protein structure. Garland, 2 edition, 1998.Google ScholarGoogle Scholar
  5. S. Chaudhuri. An overview of query optimization in relational systems. In PODS, pages 34--43, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. L. Chen, A. Gupta, and M. E. Kurul. Stack-based algorithms for pattern matching on dags. In Proc. of VLDB '05, pages 493--504, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. J. Cheng, Y. Ke, W. Ng, and A. Lu. FG-Index: towards verification-free query processing on graph databases. In Proc. of SIGMOD '07, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computation of reachability labeling for large graphs. In EDBT, pages 961--979, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SIAM J. Comput., 32(5):1338--1355, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. P. Consens and A. O. Mendelzon. GraphLog: a visual formalism for real life recursion. In PODS, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. P. Erdõs and A. Rényi. On random graphs I. Publ. Math. Debrecen, (6):290--297.Google ScholarGoogle Scholar
  12. Gene Ontology. http://www.geneontology.org/.Google ScholarGoogle Scholar
  13. R. H. Guting. GraphDB: Modeling and querying graphs in databases. In Proc. of VLDB'94, pages 297--308, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Gyssens, J. Paredaens, and D. van Gucht. A graph-oriented object database model. In Proc. of PODS '90, pages 417--424, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. H. He and A. K. Singh. Closure-Tree: An Index Structure for Graph Queries. In Proc. of ICDE'06, Atlanta, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Hopcroft and R. Karp. An n 5/2 algorithm for maximum matchings in bipartite graphs. SIAM J. Computing, 1973.Google ScholarGoogle ScholarCross RefCross Ref
  17. J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. H. V. Jagadish, L. V. S. Lakshmanan, D. Srivastava, and K. Thompson. TAX: A tree algebra for XML. In Proc. of DBPL'01, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. H. Jiang, H. Wang, P. S. Yu, and S. Zhou. GString: A novel approach for efficient search in graph databases. In ICDE, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  20. J. Lee, J. Oh, and S. Hwang. STRG-Index: Spatio-temporal region graph indexing for large video databases. In Proc. of SIGMOD, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. U. Leser. A query language for biological networks. Bioinformatics, 21:ii33--ii39, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. F. Manola and E. Miller. RDF Primer. W3C, http://www.w3.org/TR/rdf-primer/, 2004.Google ScholarGoogle Scholar
  23. E. Prud'hommeaux and A. Seaborne. SPARQL query language for RDF. W3C, http://www.w3.org/TR/rdf-sparql-query/, 2007.Google ScholarGoogle Scholar
  24. R. Ramakrishnan and J. Gehrke. Database Management Systems, chapter 24 Deductive Databases. McGraw-Hill, third edition, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. Rekers and A. Schurr. A graph grammar approach to graphical parsing. In 11th International IEEE Symposium on Visual Languages, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. G. Rozenberg (Ed.). Handbook on Graph Grammars and Computing by Graph Transformation: Foundations, volume 1. World Scientific, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. R. Schenkel, A. Theobald, and G. Weikum. Efficient creation and incremental maintenance of the HOPI index for complex XML document collections. In Proc. of ICDE '05, pages 360--371, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. N. Shadbolt, T. Berners-Lee, and W. Hall. The semantic web revisited. IEEE Intelligent Systems, 21(3):96--101, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In Proc. of PODS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. L. Sheng, Z. M. Ozsoyoglu, and G. Ozsoyoglu. A graph query language and its query processing. In ICDE, 1999.Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Y. Tian, R. C. McEachin, C. Santos, D. J. States, and J. M. Patel. SAGA: a subgraph matching tool for biological graphs. Bioinformatics, 23(2), 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. S. Tri$ßl and U. Leser. Fast and practical indexing and querying of very large graphs. In Proc. of SIGMOD '07, pages 845--856, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In Proc. of ICDE '06, page 75, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. D. W. Williams, J. Huan, and W. Wang. Graph database indexing using structured graph decomposition. In ICDE, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  35. X. Yan, P. S. Yu, and J. Han. Graph Indexing: A frequent structure-based approach. In Proc. of SIGMOD, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. S. Zhang, M. Hu, and J. Yang. TreePi: A novel graph indexing method. In ICDE, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  37. P. Zhao, J. X. Yu, and P. S. Yu. Graph indexing: Tree delta >= graph. In Proc. of VLDB, pages 938--949, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Graphs-at-a-time: query language and access methods for graph databases

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
          June 2008
          1396 pages
          ISBN:9781605581026
          DOI:10.1145/1376616

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 9 June 2008

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate785of4,003submissions,20%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader