Scientific data management in the coming decade

Published: 1 December 2005

Abstract

Scientific instruments and computer simulations are creating vast data stores that require new scientific methods to analyze and organize the data. Data volumes are approximately doubling each year. Because these new instruments have extraordinary precision, data quality is also improving rapidly. Finding the subtle effects missed by previous studies requires algorithms that can handle huge datasets and still detect very faint signals: both needles in the haystack and very small haystacks that went undetected in previous measurements.

Published in: ACM SIGMOD Record, Volume 34, Issue 4, December 2005 (86 pages)
ISSN: 0163-5808
DOI: 10.1145/1107499
Copyright © 2005 Authors
Publisher: Association for Computing Machinery, New York, NY, United States
