ABSTRACT
New technologies for scientific research are producing a deluge of data that is overwhelming traditional tools for data capture, analysis, storage, and access. We report on a study of scientific practices associated with dynamic deployments of embedded sensor networks to identify requirements for digital libraries of scientific data. As part of continuing research on scientific data management, we interviewed 22 participants in 5 environmental science projects to identify data types and uses, stages in their data life cycle, and requirements for digital library architecture. We found that scientists need continuous access to their data from the time field experiments are designed through final analysis and publication, which reflects a broader notion of "digital library." Six categories of requirements are discussed: the ability to obtain and maintain data in the field, verify data in the field, document data context for subsequent interpretation, integrate data from multiple sources, analyze data, and preserve data. Three digital library efforts currently underway within the Center for Embedded Networked Sensing are addressing these requirements, with the goal of a tightly coupled interoperable framework that, in turn, will be a component of cyberinfrastructure for science.