Observational measurements and model output data acquired or generated across the research areas of the Geosciences (also known as Earth Science) span spatial scales of tens of thousands of kilometers and temporal scales from seconds to millions of years. Here, geosciences refers to the study of the atmosphere, hydrosphere, oceans, and biosphere, as well as the Earth's core. Rapid advances in sensor deployments, computational capacity, and data storage density have resulted in dramatic increases in the volume and complexity of geoscience data. Geoscientists now regard data-intensive computing as part of their knowledge discovery process, alongside the traditional theoretical, experimental, and computational paradigms. Data-intensive computing poses unique challenges to the geoscience community, challenges that are exacerbated by the sheer size of the datasets involved.
- On the Processing of Extreme Scale Datasets in the Geosciences
Sangmi Lee Pallickara
- Springer New York
- Chapter 20