column

Provenance as first class cloud data

Published: 27 January 2010

Abstract

Digital provenance is metadata that describes the ancestry or history of a digital object. Most work on provenance focuses on how provenance increases the value of data to consumers. However, provenance is also valuable to storage providers. For example, provenance can provide hints on access patterns, help detect anomalous behavior, and enable enhanced user search capabilities. As the next-generation storage providers, cloud vendors are in a unique position to capitalize on this opportunity by incorporating provenance as a fundamental storage system primitive. To date, cloud offerings have not done so. We provide motivation for providers to treat provenance as first-class data in the cloud and, based on our experience with provenance in a local storage system, suggest a set of requirements that make provenance feasible and attractive.

    • Published in

      ACM SIGOPS Operating Systems Review, Volume 43, Issue 4
      January 2010
      105 pages
      ISSN: 0163-5980
      DOI: 10.1145/1713254

      Copyright © 2010 Authors

      Publisher

      Association for Computing Machinery

      New York, NY, United States
