skip to main content
research-article

Expressive Languages for Path Queries over Graph-Structured Data

Published:01 December 2012Publication History
Skip Abstract Section

Abstract

For many problems arising in the setting of graph querying (such as finding semantic associations in RDF graphs, exact and approximate pattern matching, sequence alignment, etc.), the power of standard languages such as the widely studied conjunctive regular path queries (CRPQs) is insufficient in at least two ways. First, they cannot output paths and second, more crucially, they cannot express relationships among paths.

We thus propose a class of extended CRPQs, called ECRPQs, which add regular relations on tuples of paths, and allow path variables in the heads of queries. We provide several examples of their usefulness in querying graph structured data, and study their properties. We analyze query evaluation and representation of tuples of paths in the output by means of automata. We present a detailed analysis of data and combined complexity of queries, and consider restrictions that lower the complexity of ECRPQs to that of relational conjunctive queries. We study the containment problem, and look at further extensions with first-order features, and with nonregular relations that add arithmetic constraints on the lengths of paths and numbers of occurrences of labels.

Skip Supplemental Material Section

Supplemental Material

References

  1. Abiteboul, S., Quass, D., McHugh, J., Widom, J., and Wiener, J. 1997. The LOREL query language for semistructured data. Int. J. Digital Libraries 1, 1, 68--88.Google ScholarGoogle ScholarCross RefCross Ref
  2. Abiteboul, S., Buneman, P., and Suciu, D. 1999. Data on the Web: From Relations to Semistructured Data and XML. Morgan Kauffman. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Abulla, P., Jonnson, B., Nilsson, M., and Saksena, M. 2003. A survey of regular model checking. In Proceedings of the 5th International Conference on Concurrence Theory. 35--48.Google ScholarGoogle Scholar
  4. Alkhateeb, F., Baget, J.-F., and Euzenat, J. 2008. Constrained regular expressions in SPARQL. In Proceedings of the International Conference on Semantic Web & Web Services. 91--99.Google ScholarGoogle Scholar
  5. Alkhateeb, F., Baget, J.-F., and Euzenat, J. 2009. Extending SPARQL with regular expression patterns (for querying RDF). J. Web Semantics 7, 2, 57--73. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Anyanwu, K. and Sheth, A. 2003. ρ-queries: Enabling querying for semantic associations on the semantic web. In Proceedings of the 12th International World Wide Web Conference. 690--699. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Anyanwu, K., Maduko, A., and Sheth, A. P. 2007. SPARQ2L: Towards support for subgraph extraction queries in RDF databases. In Proceedings of the 16th International World Wide Web Conference. 797--806. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Barceló, P., Hurtado, C. A., Libkin, L., and Wood, P. T. 2010. Expressive languages for path queries over graph-structured data. In Proceedings of the 29th ACM Symposium on Principles of Database Systems. 3--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Barceló, P., Pérez, J., and Reutter, J. 2012. Relative expressiveness of nested regular expressions. In Proceedings of the 6th Alberto Mendelzon Workshop on the Foundations of Data Management and the Web. 180--195.Google ScholarGoogle Scholar
  10. Barrett, C., Jacob, R., and Marathe, M. 2000. Formal-language-constrained path problems. SIAM J. Comput. 30, 3, 809--837. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Benedikt, M., Libkin, L., Schwentick, T., and Segoufin, L. 2003. Definable relations and first-order query languages over strings. J. ACM 50, 5, 694--751. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Berstel, J. 1979. Transductions and Context-Free Languages. B. G. Teubner.Google ScholarGoogle Scholar
  13. Blumensath, A. and Grädel, E. 2000. Automatic structures. In Proceedings of the 15th Annual IEEE Symposium on Logic in Computer Science. 51--62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Bruyère, V., Hansel, G., Michaux, C., and Villemaire, R. 1994. Logic and p-recognizable sets of integers. Bull. Belgian Math. Soc. 1, 191--238.Google ScholarGoogle ScholarCross RefCross Ref
  15. Calvanese, D., De Giacomo, G., Lenzerini, M., and Vardi, M. 2000. Containment of conjunctive regular path queries with inverse. In Proceedings of the 7th International Conference on Principles of Knowledge Representation and Reasoning. 176--185.Google ScholarGoogle Scholar
  16. Calvanese, D., De Giacomo, G., Lenzerini, M., and Vardi, M. 2002. Rewriting of regular expressions and regular path queries. J. Comput. Syst. Sci. 64, 3, 443--465.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Chrobak, M. 1986. Finite automata and unary languages. Theor. Comput. Sci. 47, 2, 149--158. Google ScholarGoogle ScholarCross RefCross Ref
  18. Consens, M. and Mendelzon, A. 1990. GraphLog: A visual formalism for real life recursion. In Proceedings of the 9th ACM Symposium on Principles of Database Systems. 404--416. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Deutsch, A. and Tannen, V. 2001. Optimization properties for classes of conjunctive regular path queries. In Proceedings of the 8th International Workshop on Database Programming Languages. 21--39. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Elgot, C. and Mezei, J. 1965. On relations defined by generalized finite automata. IBM J. Resear. Dev. 9, 1, 47--68. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Florescu, D., Levy, A., and Suciu, D. 1998. Query containment for conjunctive queries with regular expressions. In Proceedings of the 17th ACM Symposium on Principles of Database Systems. 139--148. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Freydenberger, D. and Reidenbach, D. 2010. Bad news on decision problems for patterns. Inf. Comput. 208, 1, 83--96. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Freydenberger, D. and Schweikardt, N. 2011. Expressiveness and static analysis of extended conjunctive regular path queries. In Proceedings of the 5th Alberto Mendelzon International Workshop on Foundations of Data Management.Google ScholarGoogle Scholar
  24. Frougny, C. and Sakarovitch, J. 1991. Rational relations with bounded delay. In Proceedings of the 8th Annual Symposium on Theoretical Aspects of Computer Science. 50--63. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Grahne, G. and Thomo, A. 2004. Query answering and containment for regular path queries under distortions. In Proceedings of the 3rd International Symposium on the Foundations of Information and Knowledge Systems. 98--115.Google ScholarGoogle Scholar
  26. Gusfield, D. 1997. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Holland, D., Braun, U., Maclean, D., Muniswamy-Reddy, K., and Seltzer, M. 2008. Choosing a data model and query language for provenance. In Proceedings of the 2nd International Provenance and Annotation Workshop.Google ScholarGoogle Scholar
  28. Ibarra, H., Su, J., Dang, Z., Bultan, T., and Kemmerer, R. 2002. Counter machines and verification problems. Theor. Comput. Sci. 289, 1, 165--189. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Kanza, Y. and Sagiv, Y. 2001. Flexible queries over semistructured data. In Proceedings of the 20th ACM Symposium on Principles of Database Systems. 40--51. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Kochut, K. and Janik, M. 2007. SPARQLeR: Extended SPARQL for semantic association discovery. In Proceedings of the 4th European Semantic Web Conference. 145--159. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Kozen, D. 1977. Lower bounds for natural proof systems. In Proceedings of the 18th Annual Symposium on Foundations of Computer Science. 254--266. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Lee, W., Raschid, L., Srinivasan, P., Shah, N., Rubin, D., and Noy, N. 2007. Using annotations from controlled vocabularies to find meaningful associations. In Proceedings of the 4th International Workshop on Data Integration in the Life Sciences. 247--263. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Lehmann, J., Schüpell, J., and Auer, S. 2007. Discovering unknown connections---the DBpedia relationship finder. In Proceedings of the 1st SABRE Conference on Social Semantic Web. 99--110.Google ScholarGoogle Scholar
  34. Lenstra, H. 1983. Integer programming in a fixed number of variables. Math. Oper. Resear. 8, 4, 538--548.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Mendelzon, A. and Wood, P. 1995. Finding regular simple paths in graph databases. SIAM J. Comput. 24, 6, 1235--1258. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Milo, T. and Suciu, D. 1999. Index structures for path expressions. In Proceedings of the 7th International Conference on Database Theory. 277--295. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Papadimitriou, C. 1981. On the complexity of integer programming. J. ACM 28, 4, 765--768. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Pérez, J., Arenas, M., and Gutierrez, C. 2010. nSPARQL: A navigational language for RDF. J. Web Semantics 8, 4, 255--270. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Robertson, E. L. 1974. Structure of complexity in the weak monadic second-order theories of the natural numbers. In Conference Record of 6th Annual ACM Symposium on Theory of Computing. 161--171. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Scarpellini, B. 1984. Complexity of subcases of Presburger arithmetic. Trans. AMS 284, 203--218.Google ScholarGoogle ScholarCross RefCross Ref
  41. Sheth, A., Aleman-Meza, A. B., et al. 2005. Semantic association identification and knowledge discovery for national security applications. J. Datab. Manage. 16, 1, 33--53.Google ScholarGoogle ScholarCross RefCross Ref
  42. Stockmeyer, L. J. 1974. The complexity of decision problems in automata theory and logic. Ph.D. thesis, Massachusetts Institute of Technology.Google ScholarGoogle Scholar
  43. To, A. 2009. Unary finite automata vs. arithmetic progressions. Inf. Process. Lett. 109, 17, 1010--1014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. To, A. 2010. Model checking infinite-state systems: Generic and specific approaches. Ph.D. thesis, LFCS, School of Informatics, University of Edinburgh.Google ScholarGoogle Scholar
  45. Verma, K., Seidl, H., and Schwentick, T. 2005. On the complexity of equational Horn clauses. In Proceedings of the 20th International Conference on Automated Deduction (CADE). 337--352. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Weikum, G., Kasneci, G., Ramanath, M., and Suchanek, F. 2009. Database and information-retrieval methods for knowledge discovery. Commun. ACM 52, 4, 56--64. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Expressive Languages for Path Queries over Graph-Structured Data

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Database Systems
        ACM Transactions on Database Systems  Volume 37, Issue 4
        December 2012
        345 pages
        ISSN:0362-5915
        EISSN:1557-4644
        DOI:10.1145/2389241
        Issue’s Table of Contents

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 December 2012
        • Accepted: 1 August 2011
        • Revised: 1 June 2011
        • Received: 1 February 2011
        Published in tods Volume 37, Issue 4

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader