DOI: 10.1145/2967973.2968604
research-article

Language-integrated provenance

Published: 05 September 2016

ABSTRACT

Provenance, or information about the origin or derivation of data, is important for assessing the trustworthiness of data and identifying and correcting mistakes. Most prior implementations of data provenance have involved heavyweight modifications to database systems and little attention has been paid to how the provenance data can be used outside such a system. We present extensions to the Links programming language that build on its support for language-integrated query to support provenance queries by rewriting and normalizing monadic comprehensions and extending the type system to distinguish provenance metadata from normal data. The main contribution of this paper is to show that the two most common forms of provenance can be implemented efficiently and used safely as a programming language feature with no changes to the database system.
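The idea of obtaining provenance purely by query rewriting can be illustrated with a minimal sketch. It is not the Links implementation: it uses Python with SQLite rather than Links, and it assumes that one of the forms of provenance meant is where-provenance, in which each returned value carries the table, column, and row it was copied from. The `agencies` table, its rows, and the helpers `with_where_provenance` and `run` are hypothetical names introduced for illustration.

```python
import sqlite3


def with_where_provenance(table, columns, key="id"):
    """Build an ordinary SQL query that returns every selected value together
    with its source location (table name, column name, row key).

    Identifiers are interpolated directly, so this sketch assumes they are
    trusted, hard-coded names and never user input.
    """
    selected = []
    for col in columns:
        selected.append(f"{col} AS {col}_data")
        selected.append(f"'{table}' AS {col}_table")
        selected.append(f"'{col}' AS {col}_column")
        selected.append(f"{key} AS {col}_row")
    return f"SELECT {', '.join(selected)} FROM {table} ORDER BY {key}"


def run(conn, table, columns):
    """Run the rewritten query and regroup each flat row into records whose
    fields pair a data value with its provenance triple."""
    result = []
    for row in conn.execute(with_where_provenance(table, columns)):
        record = {}
        for i, col in enumerate(columns):
            data, tbl, column, row_key = row[4 * i:4 * i + 4]
            record[col] = {"data": data, "prov": (tbl, column, row_key)}
        result.append(record)
    return result


# Hypothetical example data; the database engine itself is unmodified.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agencies (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO agencies VALUES (?, ?, ?)",
    [(1, "EdinTours", "412 1200"), (2, "Burns's", "607 3000")],
)

annotated = run(conn, "agencies", ["name", "phone"])
print(annotated[0]["phone"])
```

The provenance columns are computed by plain SQL, so nothing in the database needs to change; keeping the `data` and `prov` fields apart in the result loosely mirrors the abstract's point about distinguishing provenance metadata from normal data, although here the separation is only a convention rather than something a type system enforces.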

  • Published in

    PPDP '16: Proceedings of the 18th International Symposium on Principles and Practice of Declarative Programming
    September 2016
    249 pages
    ISBN: 9781450341486
    DOI: 10.1145/2967973
    • Conference Chair: James Cheney
    • Program Chair: Germán Vidal

    Copyright © 2016 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    PPDP '16 paper acceptance rate: 17 of 37 submissions, 46%. Overall acceptance rate: 230 of 486 submissions, 47%.
