skip to main content
10.1145/2457317.2457353acmotherconferencesArticle/Chapter ViewAbstractPublication PagesedbtConference Proceedingsconference-collections
research-article

On implementing provenance-aware regular path queries with relational query engines

Published:18 March 2013Publication History

ABSTRACT

Use of graphs is growing rapidly in social networks, semantic web, biological databases, scientific workflow provenance, and other areas. Regular Path Queries (RPQs) can be seen as a core graph query language to answer pattern-based reachability queries. Unfortunately, the number of freely available systems for querying graphs using RPQs is rather limited, and available implementations do not provide direct support for a number of desirable variants of RPQs, e.g., to return those edges that are contained in some (or all) paths that match the given regular expression R. Thus, by returning not just a pair (x, y) of end points of paths that match R, but also "witness edges" (u, v) inbetween, our RPQ variants can be understood as returning additional provenance information about the answer (x, y), i.e., those edges (u, v) that are in some (or all) paths from x to y matching R. We propose a number of such RPQ variants and show how they can be implemented using either Datalog or a suitable RDBMS. Our initial experimental results indicate that RPQs and our provenance-aware variants (RPQProv), when implemented using conventional relational technologies, yield reasonable performance even for relatively large graphs. On the other hand, the overhead associated with some of these variants also makes efficient handling of provenance-aware graph queries an interesting challenge for future research.

References

  1. F. Afrati and F. Toni. Chain queries expressible by linear datalog programs. In Deductive Databases and Logic Programming (DDLP), pages 49--58, 1997.Google ScholarGoogle Scholar
  2. F. N. Afrati and J. D. Ullman. Transitive closure and recursive datalog implemented on clusters. In Proceedings of the 15th International Conference on Extending Database Technology, EDBT '12, pages 132--143, New York, NY, USA, 2012. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. M. Arenas, S. Conca, and J. Pérez. Counting beyond a yottabyte, or how sparql 1.1 property paths will prevent adoption of the standard. In Proceedings of the 21st international conference on World Wide Web, WWW '12, pages 629--638, New York, NY, USA, 2012. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. V. Cuevas-Vicenttín, S. Dey, B. Ludäscher, M. Li Yuan Wang, and T. Song. Modeling and querying scientific workflow provenance in the d-opm. In 7th Workshop on Workflows in Support of Large-Scale Science (WORKS), november 2012.Google ScholarGoogle Scholar
  5. O. De Moor, D. Lacey, and E. Van Wyk. Universal regular path queries. Higher Order Symbol. Comput., 16(1-2):15--35, Mar. 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. Dean and S. Ghemawat. Mapreduce: simplified data processing on large clusters. Commun. ACM, 51(1):107--113, Jan. 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Dey, S. Köhler, S. Bowers, and B. Ludäscher. Datalog as a lingua franca for provenance querying and reasoning. In Workshop on the Theory and Practice of Provenance (TaPP), 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. W. Fan, X. Wang, and Y. Wu. Performance guarantees for distributed reachability queries. Proc. VLDB Endow., 5(11):1304--1316, July 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. R. Kabler, Y. E. Ioannidis, and M. J. Carey. Performance evaluation of algorithms for transitive closure. Inf. Syst., 17(5):415--441, Sept. 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. A. Koschmieder and U. Leser. Regular path queries on large graphs. In Proceedings of the 24th international conference on Scientific and Statistical Database Management, SSDBM'12, pages 177--194, Berlin, Heidelberg, 2012. Springer-Verlag. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Y. A. Liu, T. Rothamel, F. Yu, S. D. Stoller, and N. Hu. Parametric regular path queries. In Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation, PLDI '04, pages 219--230, New York, NY, USA, 2004. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, SIGMOD '10, pages 135--146, New York, NY, USA, 2010. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A. O. Mendelzon and P. T. Wood. Finding regular simple paths in graph databases. SIAM J. Comput., 24(6):1235--1258, Dec. 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. L. Moreau, B. Clifford, J. Freire, J. Futrelle, Y. Gil, P. Groth, N. Kwasnikowska, S. Miles, P. Missier, J. Myers, B. Plale, Y. Simmhan, E. Stephan, and J. V. den Bussche. The open provenance model core specification (v1.1). Future Gener. Comput. Syst., 27(6):743--756, June 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. D. Ullman and A. Van Gelder. Parallel complexity of logical query programs. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science, SFCS '86, pages 438--454, Washington, DC, USA, 1986. IEEE Computer Society. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. L. Y. Wang. Querying Provenance as Regular Path Queries with Relational Databases. Master's thesis, University of California, Davis, 2012.Google ScholarGoogle Scholar

Index Terms

  1. On implementing provenance-aware regular path queries with relational query engines

              Recommendations

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in
              • Published in

                cover image ACM Other conferences
                EDBT '13: Proceedings of the Joint EDBT/ICDT 2013 Workshops
                March 2013
                423 pages
                ISBN:9781450315999
                DOI:10.1145/2457317

                Copyright © 2013 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 18 March 2013

                Permissions

                Request permissions about this article.

                Request Permissions

                Check for updates

                Qualifiers

                • research-article

                Acceptance Rates

                EDBT '13 Paper Acceptance Rate7of10submissions,70%Overall Acceptance Rate7of10submissions,70%

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader