skip to main content
10.1145/1265530.1265535acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
Article

Provenance semirings

Published:11 June 2007Publication History

ABSTRACT

We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.

Skip Supplemental Material Section

Supplemental Material

p31-green_56k.mp4

mp4

28.4 MB

p31-green_768k.mp4

mp4

169.4 MB

References

  1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. F. N. Afrati and C. H. Papadimitriou. The parallel complexity of simple logic programs. J. ACM, 40(4):891--916, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. O. Benjelloun, A. D. Sarma, A. Y. Halevy, and J. Widom. ULDBs: Databases with uncertainty and lineage. In VLDB, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Bistarelli, U. Montanari, and F. Rossi. Semiring-based constraint satisfaction and optimization. JACM, 44(2):201--236, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S. Bistarelli, U. Montanari, and F. Rossi. Semiring-based constraint logic programming: syntax and semantics. ACM TOPLAS, 23(1):1--29, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. P. Buneman, S. Khanna, and W. -C. Tan. Why and where: A characterization of data provenance. In ICDT, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In STOC, 1977. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. L. Chiticariu and W.-C. Tan. Debugging schema mappings with routes. In VLDB, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. N. Chomsky and M. -P. Schützenberger. The algebraic theory of context-free languages. Computer Programming and Formal Systems, pages 118--161, 1963.Google ScholarGoogle ScholarCross RefCross Ref
  10. M. P. Consens and A. O. Mendelzon. Low complexity aggregation in graphlog and datalog. In ICDT, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. P. Crawley and R. P. Dilworth. Algebraic Theory of Lattices. Prentice Hall, 1973.Google ScholarGoogle Scholar
  12. Y. Cui. Lineage Tracing in Data Warehouses. PhD thesis, Stanford University, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Y. Cui, J. Widom, and J. L. Wiener. Tracing the lineage of view data in a warehousing environment. TODS, 25(2), 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. N. Dalvi and D. Suciu. Efficient query evaluation on probabilistic databases. In VLDB, 2004.Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. F. Dong and L. V. S. Lakshmanan. Deductive databases with incomplete information. In Symposium on Logic Programming, 1992.Google ScholarGoogle Scholar
  16. N. Fuhr. Probabilistic datalog -- a logic for powerful retrieval methods. In SIGIR, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. N. Fuhr and T. Rölleke. A probabilistic relational algebra for the integration of information retrieval and database systems. TOIS, 14(1), 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. T. J. Green and V. Tannen. Models for incomplete and probabilistic information. In EDBT Workshops, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. T. Imielinski and W. Lipski. Incomplete information in relational databases. JACM, 31(4), 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Y. E. Ioannidis and R. Ramakrishnan. Containment of conjunctive queries: beyond relations as sets. TODS, 20(3), 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. Kolaitis and M. Vardi. Conjunctive-query containment and constraint satisfaction. In PODS, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. W. Kuich. Semirings and formal power series. In Handbook of formal languages, volume 1. Springer, 1997.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. L. V. S. Lakshmanan, N. Leone, R. Ross, and V. S. Subrahmanian. Probview: a flexible probabilistic database system. TODS, 22(3), 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. L. V. S. Lakshmanan and F. Sadri. Probabilistic deductive databases. In Symposium on Logic Programming, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. Maher and R. Ramakrishnan. Déjàvu in fixpoints of logic programs. In NACLP, 1989.Google ScholarGoogle Scholar
  26. I. S. Mumick. Query Optimization in Deductive and Relational Databases. PhD thesis, Stanford University, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. I. S. Mumick, H. Pirahesh, and R. Ramakrishnan. The magic of duplicates and aggregates. In VLDB Journal, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. I. S. Mumick and O. Shmueli. Finiteness properties of database queries. In Fourth Australian Database Conference, 1993.Google ScholarGoogle Scholar
  29. Y. Sagiv and M. Yannakakis. Equivalences among relational expressions with the union and difference operators. J. ACM, 27(4), 1980. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. A. D. Sarma, O. Benjelloun, A. Halevy, and J. Widom. Working models for uncertain data. In ICDE, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. D. Ullman. Principles of Database and Knowledge-Base Systems, volume II. Computer Science Press, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. L. A. Zadeh. Fuzzy sets. Inf. Control, 8(3), 1965.Google ScholarGoogle Scholar
  33. E. Zimányi. Query evaluation in probabilistic relational databases. TCS, 171(1-2), 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Provenance semirings

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      PODS '07: Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
      June 2007
      328 pages
      ISBN:9781595936851
      DOI:10.1145/1265530

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 11 June 2007

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      PODS '07 Paper Acceptance Rate28of187submissions,15%Overall Acceptance Rate642of2,707submissions,24%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader