Remote Sensing of Environment

Volume 127, December 2012, Pages 237-246

Spatial analysis of remote sensing image classification accuracy

https://doi.org/10.1016/j.rse.2012.09.005

Abstract

The error matrix is the most common way of expressing the accuracy of remote sensing image classifications, such as land cover. However, it and the measures that can be calculated from it have been criticised for not providing any indication of the spatial distribution of errors. Other research has identified the need for methods to analyse the spatial non-stationarity of error and to visualise the spatial variation in classification uncertainty. This research uses geographically weighted approaches to model the spatial variations in the accuracy of both (crisp) Boolean and (soft) fuzzy land cover classes. Remotely sensed data were classified using a maximum likelihood classifier and a fuzzy classifier to predict Boolean and fuzzy land cover classes respectively. Field data were collected at sub-pixel locations and used to generate soft and crisp validation data. A Geographically Weighted Regression was used to analyse spatial variations in the relationships between observations of Boolean land cover in the field and land cover classified from remote sensing imagery. A geographically weighted difference measure was used to analyse spatial variations in fuzzy land cover accuracy. Maps of the spatial distribution of accuracy were created for fuzzy and Boolean classes. This research demonstrates that data collected as part of a standard remote sensing validation exercise can be used to estimate mapped, spatial distributions of accuracy that would augment standard accuracy measures reported in the error matrix. It suggests that geographically weighted approaches, and the spatially explicit representations of accuracy they support, offer the opportunity to report land cover accuracy in a more informative way.

Highlights

  • The confusion matrix provides no information on the spatial distribution of errors.
  • The spatial distribution of correspondence provides richer accuracy information.
  • Geographically weighted models were used to map Boolean and Fuzzy accuracy.
  • This is a methodological advance in accuracy assessment in remote sensing.

Introduction

Land cover information can be generated through the classification of remotely sensed data. Areas or pixels with similar spectral characteristics are allocated to classes or categories, each of which represents a different type of land cover feature. It is a process of generalisation, and involves a number of choices about image type, resolution, number and types of classes, training sites, etc. (Campbell, 2007). Assessing map accuracy in an objective manner is fundamental to most land cover mapping projects (Foody, 2002; Strahler et al., 2006). The accepted paradigm for doing this is through comparison with some alternative data in order to determine measures of accuracy which “express the degree of ‘correctness’ of a map or classification” (Foody, 2002, p. 186). Determining the accuracy of land cover classified from remotely sensed data is important. Land cover is an input into environmental models incorporating land-atmosphere interactions (GLP, 2005) and land cover change is a major variable in climate change analyses (Feddema et al., 2005). In this context, accuracy descriptions can help users to assess the uncertainties associated with incorporating land cover data into their models or to decide between land cover datasets, especially where there is a choice between data with different thematic or spatial characteristics (See & Fritz, 2006). Thus accuracy is one of the key aspects of any remotely sensed data product.

The most common approach for assessing thematic map accuracy is to compare the classified land cover with alternative but spatially and temporally coincident data, which are considered to be of higher accuracy. A sample of the land cover data created by the remote sensing analysis (here referred to as ‘classified’ data) is compared against some validation data (here referred to as ‘reference’ data). The resulting cross tabulation of classified data against reference data is commonly known as the error matrix, but in the literature is also called the confusion, contingency or validation matrix. The cross tabulation provided by the error matrix allows a number of standard reporting measures to be calculated, including overall accuracy as well as user's and producer's accuracies (Congalton, 1991; Congalton and Green, 1999). These accuracy statistics provide measures of the reliability of the classified data and the degree (but not spatial extent) to which they are correct. Therefore the appropriateness of the information conveyed by the error matrix may be limited when specific local conditions vary, for example when non-stationary error distributions occur, or in the presence of heteroscedastic residual distributions, i.e. when subsets of the data vary from the overall trend (Stehman, 2000; Stehman, 2006).
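The standard measures named above can be illustrated with a minimal sketch. It assumes integer class labels, classified data in rows and reference data in columns, and a small invented sample; the function names `error_matrix` and `accuracy_measures` are hypothetical and are not taken from the paper.

```python
import numpy as np

def error_matrix(classified, reference, n_classes):
    """Cross-tabulate classified labels (rows) against reference labels (columns)."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(m, (classified, reference), 1)  # count each (classified, reference) pair
    return m

def accuracy_measures(m):
    """Return overall accuracy plus per-class user's and producer's accuracies.

    Assumes every class occurs at least once in both the classified and the
    reference sample; otherwise the divisions below would divide by zero.
    """
    diag = np.diag(m).astype(float)
    overall = diag.sum() / m.sum()
    users = diag / m.sum(axis=1)      # correct / total classified as the class (rows)
    producers = diag / m.sum(axis=0)  # correct / total reference in the class (columns)
    return overall, users, producers

# Invented validation sample of 10 points and 3 classes (illustration only).
classified = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 1])
reference = np.array([0, 0, 1, 1, 1, 2, 2, 0, 2, 1])

m = error_matrix(classified, reference, n_classes=3)
overall, users, producers = accuracy_measures(m)
print(m)
print(overall, users, producers)
```

All three measures are single global summaries of the sample: none of them uses the locations of the validation points, which is the limitation discussed next.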

There are two related limitations associated with accuracy assessments and error summaries calculated from the error matrix (McGwire & Fisher, 2001):

  • 1) The error matrix and the accuracy measures it supports provide no information about the spatial distribution of error;

  • 2) The overall accuracy measures derived from the error matrix may be inappropriate for sub-regions, where local error rates may be much larger or smaller than the global measures.

Overcoming such problems is important because many users of land cover data may be interested only in a particular subset of the data, either relating to a specific locale or to specific classes.

Some work in the remote sensing literature has explored the spatial distribution of different types of error and methods for reporting it. Campbell (1981) compared Landsat multispectral scanner images in the same growing season and found that misclassified pixels tended to be clustered. Congalton (1988) applied a Getis and Ord approach to analyse join count statistics to compare two datasets. Steele et al. (1998) used kriging to provide an optimal interpolation of map error. McGwire and Fisher (2001) recommended the use of Monte Carlo approaches to model the spatial distribution of errors. Some more recent research has examined the variability or non-stationarity of the distribution of errors. Riemann et al. (2010) describe a number of metrics for characterising the accuracy of spatial data that are dependent on reference data properties, and Foody (2005) estimated local accuracy measures by interpolating the outputs of confusion matrices calculated at regularly spaced intervals. Current validation and accuracy techniques in remote sensing have largely ignored the advances supported by such methods.

This research is in the spirit of Foody (2005). It uses Geographically Weighted Regression (GWR), a statistical method that explicitly deals with spatial non-stationarity (Brunsdon et al., 1996; Fotheringham et al., 2002), and a geographically weighted difference measure to analyse the spatial variations in the relationship between reference data and classified data for Boolean and fuzzy classes respectively. Geographically weighted approaches estimate spatially distributed measures of accuracy that are more informative than those provided by the confusion matrix. The paper proceeds as follows. Section 2 describes some of the scientific background to error matrices and their use in Boolean and fuzzy classifications. The methods and GWR are described in Section 3. Section 4 presents the results, Section 5 discusses the issues arising from this research and Section 6 draws some conclusions.
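The idea of a geographically weighted accuracy measure can be sketched as follows. This is a deliberately simplified stand-in and not the GWR model used in the paper: it computes a kernel-weighted proportion of agreement for Boolean labels and a kernel-weighted mean absolute difference for fuzzy memberships. The Gaussian kernel, the bandwidth value, the function names and the sample data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_weights(coords, target, bandwidth):
    """Gaussian distance-decay weights of each validation point around a target location."""
    d = np.linalg.norm(coords - target, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def gw_agreement(coords, classified, reference, target, bandwidth):
    """Locally weighted proportion of points where classified and reference labels agree."""
    w = gaussian_weights(coords, target, bandwidth)
    agree = (classified == reference).astype(float)
    return float(np.sum(w * agree) / np.sum(w))

def gw_fuzzy_difference(coords, fuzzy_classified, fuzzy_reference, target, bandwidth):
    """Locally weighted mean absolute difference between fuzzy memberships in [0, 1]."""
    w = gaussian_weights(coords, target, bandwidth)
    diff = np.abs(fuzzy_classified - fuzzy_reference)
    return float(np.sum(w * diff) / np.sum(w))

# Invented sample: four points in the west (all correct), four in the east (all wrong).
coords = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 0], [10, 1], [11, 0], [11, 1]], dtype=float)
classified = np.array([1, 1, 0, 0, 1, 1, 0, 0])
reference = np.array([1, 1, 0, 0, 0, 0, 1, 1])
fuzzy_c = np.array([0.9, 0.8, 0.2, 0.1, 0.9, 0.8, 0.6, 0.5])
fuzzy_r = np.array([0.9, 0.8, 0.2, 0.1, 0.5, 0.4, 0.2, 0.1])

global_agreement = float(np.mean(classified == reference))
west = gw_agreement(coords, classified, reference, np.array([0.0, 0.0]), bandwidth=2.0)
east = gw_agreement(coords, classified, reference, np.array([10.0, 0.0]), bandwidth=2.0)
fuzzy_east = gw_fuzzy_difference(coords, fuzzy_c, fuzzy_r, np.array([10.0, 0.0]), bandwidth=2.0)
print(global_agreement, west, east, fuzzy_east)
```

In this invented sample the global agreement is 0.5, yet the local estimates are close to 1 in the west and close to 0 in the east, which is the kind of spatial variation a single global figure hides. Evaluating the functions over a grid of target locations would give a mapped accuracy surface; the bandwidth controls how local each estimate is.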

Background

It is typical for the quality of spatial data such as land cover from remotely sensed imagery to be described using measures of thematic accuracy. The origins of the requirement of at least 85% thematic map accuracy can be traced back to Anderson (1971). Although the scientific basis for this accuracy level has been criticised (Congalton and Green, 1999; Pontius and Millones, 2011), it is historically related to land information being used for taxation assessments (Fisher, 1991; Fisher et al.,

Data and study area

The study area is located in the north-western part of Libya, in the Jifara Plain around Tripoli. Satellite imagery from the Système Pour l'Observation de la Terre (SPOT) 5 sensor from 2009 was resampled to 30 m × 30 m as part of a wider study examining land cover changes using Landsat data from 1976, 1989 and 2005. It was classified into 6 classes: Urban, Woodland, Vegetation, Grazing Land, Bare areas and Water. Water was not a focus for this research. The class descriptions are

Results

The validation data were used to construct a standard error matrix (Table 3). The crisp, Boolean data allow a straightforward comparison between data classified from the remote sensing imagery and the reference data collected in the field, and allow user's and producer's accuracies to be calculated. Whilst it is evident that some classes are more reliably classified than others, the table provides no information about the spatial distribution of either overall error or errors for different

Discussion

The major contributions of this research relate to the development of 1) spatially distributed measures of accuracy using a kernel and distance weighting in geographically weighted accuracy measures; 2) a portmanteau accuracy measure for Boolean land cover data; and 3) a fuzzy difference measure to describe the accuracy of fuzzy classifications. The outputs of the Boolean analysis, using a portmanteau measure of accuracy, indicate the spatial variation in the extent to which the reference

Conclusions

This research uses geographically weighted approaches to describe the spatial variation in the accuracy of Boolean and fuzzy classifications of remotely sensed data. It proposes a portmanteau approach to describe Boolean land cover accuracy and fuzzy difference measures to describe the accuracy of fuzzy land cover. It addresses two long-standing gaps in the analysis and communication of accuracy and error in land cover. First, by analysing the spatial distribution of errors it provides a better

Acknowledgements

The authors would like to thank the anonymous reviewers whose meticulous consideration of the earlier drafts and related comments have resulted in a much improved paper.

References (50)

  • D. Rocchini

    While Boolean sets non-gently rip: A theoretical framework on fuzzy sets for mapping landscape patterns

    Ecological Complexity

    (2010)
  • B.M. Steele et al.

    Estimation and mapping of misclassification probabilities for thematic land cover maps

    Remote Sensing of Environment

    (1998)
  • S.V. Stehman

    Practical implications of design-based sampling inference for thematic map accuracy assessment

    Remote Sensing of Environment

    (2000)
  • J.R. Anderson

    Land use classification schemes used in selected recent geographic applications of remote sensing

    Photogrammetric Engineering

    (1971)
  • C. Arnot et al.

    Mapping the ecotone with Fuzzy sets

  • C.F. Brunsdon et al.

    Geographically weighted regression — A method for exploring spatial non-stationarity

    Geographical Analysis

    (1996)
  • J. Campbell

    Spatial correlation effects upon accuracy of supervised classification of land cover

    Photogrammetric Engineering and Remote Sensing

    (1981)
  • J.B. Campbell

    Introduction to remote sensing

    (2007)
  • A.J. Comber et al.

    Using semantics to clarify the conceptual confusion between land cover and land use: The example of ‘forest’

    Journal of Land Use Science

    (2008)
  • R.G. Congalton

    Using spatial auto-correlation analysis to explore the errors in maps generated from remotely sensed data

    Photogrammetric Engineering and Remote Sensing

    (1988)
  • R.G. Congalton et al.

    Assessing the accuracy of remotely sensed data: Principles and practices

    (1999)
  • J.J. Feddema et al.

    The importance of land-cover change in simulating future climates

    Science

    (2005)
  • P.F. Fisher

    Modelling soil map‐unit inclusions by Monte Carlo simulation

    International Journal of Geographical Information Systems

    (1991)
  • P.F. Fisher

    The pixel: A snare and a delusion

    International Journal of Remote Sensing

    (1997)
  • P.F. Fisher

    Improved modelling of elevation error with geostatistics

    GeoInformatica

    (1998)