DOI: 10.1145/3311790.3396636
research-article

Toward a Data Lifecycle Model for NSF Large Facilities

Published: 26 July 2020

ABSTRACT

National Science Foundation large facilities conduct large-scale physical and natural science research. They include telescopes that survey the entire sky, gravitational wave detectors that look deep into our universe’s past, sensor-driven field sites that collect a range of biological and environmental data, and more. The Cyberinfrastructure Center of Excellence (CICoE) pilot project aims to develop a model for a center that facilitates community building, fosters knowledge sharing, and applies best practices in consulting with large facilities with regard to their cyberinfrastructure. To accomplish this goal, the pilot began an in-depth study of how large facilities manage their data during the course of their research. Large facilities are diverse and highly complex, from the types of data they capture, to the types of equipment they use, to the types of data processing and analysis they conduct, to their policies on data sharing and use. Because of this complexity, the pilot needed to find a single lens through which it could frame its growing understanding of large facilities and identify areas where it could best serve them. The pilot’s research into large facilities surfaced common themes, which enabled the creation of a data lifecycle model that successfully captures the data management practices of large facilities. This model has enabled the pilot to organize its thinking about large facilities and to frame its support and consultation efforts around the cyberinfrastructure used during each lifecycle stage. This paper describes the model and discusses how it was applied to disaster recovery planning for a representative large facility: IceCube.

Supplemental Material

3311790.3396636.mp4 (MP4, 207.1 MB)

  • Published in

    PEARC '20: Practice and Experience in Advanced Research Computing
    July 2020
    556 pages
    ISBN: 9781450366892
    DOI: 10.1145/3311790

    Copyright © 2020 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 133 of 202 submissions, 66%
