Model-based Geostatistics

Authors: Peter J. Diggle, Paulo J. Ribeiro Jr.

Publisher: Springer New York

Book series: Springer Series in Statistics

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

Geostatistics is concerned with estimation and prediction problems for spatially continuous phenomena, using data obtained at a limited number of spatial locations. The name reflects its origins in mineral exploration, but the methods are now used in a wide range of settings including public health and the physical and environmental sciences. Model-based geostatistics refers to the application of general statistical principles of modeling and inference to geostatistical problems. This volume is the first book-length treatment of model-based geostatistics.

The authors have written an expository text, emphasizing statistical methods and applications rather than the underlying mathematical theory. Analyses of datasets from a range of scientific contexts feature prominently, and simulations are used to illustrate theoretical results. Readers can reproduce most of the computational results in the book by using the authors' R-based software package, geoR, whose usage is illustrated in a computation section at the end of each chapter.

The book assumes a working knowledge of classical and Bayesian methods of inference, linear models, and generalized linear models, but does not require previous exposure to spatial statistical models or methods. The authors have used the material in MSc-level statistics courses.

Table of contents

Frontmatter

1. Introduction

Abstract

The term spatial statistics is used to describe a wide range of statistical models and methods intended for the analysis of spatially referenced data. Cressie (1993) provides a general overview. Within spatial statistics, the term geostatistics refers to models and methods for data with the following characteristics. Firstly, values Y_i: i = 1, ..., n are observed at a discrete set of sampling locations x_i within some spatial region A. Secondly, each observed value Y_i is either a direct measurement of, or is statistically related to, the value of an underlying continuous spatial phenomenon, S(x), at the corresponding sampling location x_i. This rather abstract formulation can be translated to a variety of more tangible scientific settings, as the following examples demonstrate.

2. An overview of model-based geostatistics

Abstract

The aim of this chapter is to provide a short overview of model-based geostatistics, using the elevation data of Example 1.1 to motivate the various stages in the analysis. Although this example is very limited from a scientific point of view, its simplicity makes it well suited to the task in hand. Note, however, that Handcock and Stein (1993) show how to construct a useful explanatory variable for these data using a map of streams which run through the study region.

3. Gaussian models for geostatistical data

Abstract

Gaussian stochastic processes are widely used in practice as models for geostatistical data. These models rarely have any physical justification. Rather, they are used as convenient empirical models which can capture a wide range of spatial behaviour according to the specification of their correlation structure. Historically, one very good reason for concentrating on Gaussian models was that they are uniquely tractable as models for dependent data. With the increasing use of computationally intensive methods, and in particular of simulation-based methods of inference, the analytic tractability of Gaussian models is becoming a less compelling reason to use them. Nevertheless, it is still convenient to work within a standard model class in routine applications. The scope of the Gaussian model class can be extended by using a transformation of the original response variable, and with this extra flexibility the model often provides a good empirical fit to data. Also, within the specific context of geostatistics, the Gaussian assumption is the model-based counterpart of some widely used geostatistical prediction methods, including simple, ordinary and universal kriging (Journel and Huijbregts, 1978; Chilès and Delfiner, 1999). We shall use the Gaussian model initially as a model in its own right for geostatistical data with a continuously varying response, and later as an important component of a hierarchically specified generalised linear model for geostatistical data with a discrete response variable, as previously discussed in Section 1.4.

4. Generalized linear models for geostatistical data

Abstract

In the classical setting of independently replicated data, the generalized linear model (GLM) as introduced by Nelder and Wedderburn (1972) provides a unifying framework for regression modelling of continuous or discrete data. The original formulation has since been extended, in various ways, to accommodate dependent data. In this chapter we enlarge on the brief discussion of Section 1.4 to consider extensions of the classical GLM which are suitable for geostatistical applications.

5. Classical parameter estimation

Abstract

In this chapter, we discuss methods for formulating a suitable geostatistical model and estimating its parameters. We use the description “classical” in two different senses: firstly, as a reference to the variogram-based methods of estimation which are widely used in classical geostatistics as developed by the Fontainebleau school; secondly, within mainstream statistical methodology as a synonym for non-Bayesian. The chapter has a strong focus on the linear Gaussian model. This is partly because the Gaussian model is, from our perspective, implicit in much of classical geostatistical methodology, and partly because model-based estimation methods are most easily implemented in the linear Gaussian case. We discuss non-Bayesian estimation for generalized linear geostatistical models in Section 5.5, indicating in particular why maximum likelihood estimation is feasible in principle, but difficult to implement in practice.
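
As a rough illustration of the variogram-based estimation mentioned in this abstract, the sketch below computes a method-of-moments empirical variogram: for each distance bin, it averages half the squared differences between pairs of observations. It is not the book's geoR code; the function name `empirical_variogram` and the parameters `bin_width` and `max_dist` are hypothetical names chosen for this example.

```python
import math
from collections import defaultdict


def empirical_variogram(coords, values, bin_width, max_dist):
    """Average 0.5 * (y_i - y_j)^2 over pairs, grouped into distance bins.

    coords: list of (x, y) sampling locations
    values: observed responses at those locations
    Hypothetical helper for illustration; not taken from geoR.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d >= max_dist:
                continue  # ignore pairs beyond the maximum distance
            b = int(d // bin_width)  # index of the distance bin
            sums[b] += 0.5 * (values[i] - values[j]) ** 2
            counts[b] += 1
    # One tuple per non-empty bin: (bin midpoint, estimate, number of pairs)
    return [
        ((b + 0.5) * bin_width, sums[b] / counts[b], counts[b])
        for b in sorted(counts)
    ]


if __name__ == "__main__":
    coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
    values = [1.0, 2.0, 4.0, 3.0]
    for mid, gamma, pairs in empirical_variogram(coords, values, 1.0, 2.5):
        print(f"distance ~{mid:.1f}: semivariance {gamma:.2f} from {pairs} pairs")
```

Bins containing few pairs give unstable estimates, which is one reason the number of pairs is returned alongside each value.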

6. Spatial prediction

Abstract

In this chapter, we consider the problem of using the available data to predict aspects of the realised, but unobserved, signal S(·). More formally, our target for prediction is the realised value of a random variable T = T(S), where S denotes the complete set of realised values of S(x) as x varies over the spatial region of interest, A. The simplest example of this general problem is to predict the value of the signal, T = S(x), at an arbitrary location x, using observed data Y = (Y_1, ..., Y_n), where each Y_i represents a possibly noisy version of the corresponding S(x_i). Other common targets T include the integral of S(x) over a prescribed sub-region of A or, more challengingly, a non-linear functional such as the maximum of S(x), or the set of locations for which S(x) exceeds some prescribed value. In this chapter, we ignore the problem of parameter estimation, in effect treating all model parameters as known quantities.
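
The simplest case described above, predicting S(x) at one location with all parameters treated as known, can be sketched as follows. The example assumes a stationary Gaussian signal with mean `mu`, variance `sigma2`, an exponential correlation function with range parameter `phi`, and independent measurement noise with variance `tau2`. These modelling choices and all function names are assumptions made for this illustration, not the book's geoR implementation. Under those assumptions the predictor is mu + r'V^{-1}(y - mu), with prediction variance sigma2 - r'V^{-1}r, where V is the covariance matrix of the data and r the vector of covariances between S(x) and the data.

```python
import math


def exp_cov(d, sigma2, phi):
    """Exponential covariance (an assumed model): sigma2 * exp(-d / phi)."""
    return sigma2 * math.exp(-d / phi)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def simple_kriging(coords, y, x0, mu, sigma2, phi, tau2):
    """Return (prediction, prediction variance) for S(x0), parameters known."""
    n = len(y)
    # Covariance matrix of the data: signal covariance plus noise on the diagonal
    v = [
        [
            exp_cov(math.dist(coords[i], coords[j]), sigma2, phi)
            + (tau2 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Covariances between the signal at x0 and each observation
    r = [exp_cov(math.dist(x0, c), sigma2, phi) for c in coords]
    w = solve(v, r)  # kriging weights V^{-1} r
    pred = mu + sum(wi * (yi - mu) for wi, yi in zip(w, y))
    var = sigma2 - sum(wi * ri for wi, ri in zip(w, r))
    return pred, var


if __name__ == "__main__":
    coords = [(0, 0), (1, 0), (0, 1)]
    y = [1.0, 2.0, 3.0]
    print(simple_kriging(coords, y, (0.5, 0.5), 2.0, 1.0, 0.5, 0.1))
```

With `tau2 = 0` the predictor interpolates the data exactly, and far from every sampling location it reverts to the mean `mu` with variance `sigma2`.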

7. Bayesian inference

Abstract

In Chapters 5 and 6 we discussed geostatistical inference from a classical or non-Bayesian perspective, treating parameter estimation and prediction as separate problems. We did this for two reasons, one philosophical, the other practical. Firstly, in the non-Bayesian setting, there is a fundamental distinction between a parameter and a prediction target. A parameter has a fixed, but unknown value which represents a property of the processes which generate the data, whereas a prediction target is the realised value of a random variable associated with those same processes. Secondly, estimation and prediction are usually operationally separate in geostatistical practice, meaning that we first formulate our model and estimate its parameters, then plug the estimated parameter values into theoretical prediction equations as if they were the true values. An obvious concern with this two-phase approach is that ignoring uncertainty in the parameter estimates may lead to optimistic assessments of predictive accuracy. It is possible to address this concern in various ways without being Bayesian, but in our view the Bayesian approach gives a more elegant solution, and it is the one which we have adopted in our own work.

8. Geostatistical design

Abstract

In this chapter, we consider the specific design problem of where to locate the sample points x_i: i = 1, ..., n. In particular applications other design issues, such as what to measure at each location, what covariates to record and so forth, may be at least as important as the location of the sample points. But questions of this kind can only be addressed in specific contexts, whereas the sample-location problem can be treated generically.

Backmatter

Title: Model-based Geostatistics
Authors: Peter J. Diggle, Paulo J. Ribeiro Jr.
Publisher: Springer New York
Electronic ISBN: 978-0-387-48536-2
Print ISBN: 978-0-387-32907-9
DOI: https://doi.org/10.1007/978-0-387-48536-2