Spatial analysis of remote sensing image classification accuracy
Highlights
► The confusion matrix provides no information on the spatial distribution of errors.
► The spatial distribution of correspondence provides richer accuracy information.
► Geographically weighted models were used to map Boolean and Fuzzy accuracy.
► This is a methodological advance in accuracy assessment in remote sensing.
Introduction
Land cover information can be generated through the classification of remotely sensed data. Areas or pixels with similar spectral characteristics are allocated to classes or categories each of which represents a different type of land cover feature. It is a process of generalisation, and involves a number of choices about image type, resolution, number and types of classes, training sites, etc. (Campbell, 2007). Assessing map accuracy in an objective manner is fundamental to most land cover mapping projects (Foody, 2002, Strahler et al., 2006). The accepted paradigm for doing this is through comparison with some alternative data in order to determine measures of accuracy which “express the degree of ‘correctness’ of a map or classification” (Foody, 2002, p186). Determining the accuracy of land cover classified from remotely sensed data is important. Land cover is an input into environmental models incorporating land-atmosphere interactions (GLP, 2005) and land cover change is a major variable in climate change analyses (Feddema et al., 2005). In this context, accuracy descriptions can help the user to assess the uncertainties associated with incorporating land cover data into their model or to decide between land cover datasets, especially where there is a choice between data with different thematic or spatial characteristics (See & Fritz, 2006). Thus accuracy is one of the key aspects of any remotely sensed data product.
The most common approach for assessing thematic map accuracy is to compare the classified land cover with alternative but spatially and temporally coincident data, which are considered to be of higher accuracy. A sample of the land cover data created by the remote sensing analysis (here referred to as ‘classified’ data) is compared against some validation data (here referred to as ‘reference’ data). The resulting cross tabulation of classified data against reference data is commonly known as the error matrix, but in the literature is also called the confusion, contingency or validation matrix. The cross tabulation provided by the error matrix allows a number of standard reporting measures to be calculated including overall accuracy as well as user's and producer's accuracies (Congalton, 1991, Congalton and Green, 1999). These accuracy statistics provide measures of the reliability of the classified data and the degree (but not spatial extent) to which they are correct. Therefore the appropriateness of the information conveyed by the error matrix may be limited when specific local conditions vary, for example when non-stationary error distributions occur, or in the presence of heteroscedastic residual distributions — i.e. when sub-sets of the data vary from the overall trend (Stehman, 2000, Stehman, 2006).
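As a minimal sketch of the reporting measures named above, the following builds an error matrix from paired classified and reference labels and derives overall, user's and producer's accuracies. The function names and the ten-point sample are hypothetical and are not taken from the paper; rows are assumed to hold classified labels and columns reference labels.

```python
from collections import Counter


def error_matrix(classified, reference, n_classes):
    """Cross-tabulate classified labels (rows) against reference labels (columns)."""
    counts = Counter(zip(classified, reference))
    return [[counts[(i, j)] for j in range(n_classes)] for i in range(n_classes)]


def accuracy_measures(matrix):
    """Return overall accuracy plus per-class user's and producer's accuracies."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = [matrix[k][k] for k in range(n)]
    row_totals = [sum(matrix[k]) for k in range(n)]  # samples classified as k
    col_totals = [sum(matrix[i][k] for i in range(n)) for k in range(n)]  # reference k
    overall = sum(correct) / total
    # None marks a class with no samples, where the ratio is undefined.
    users = [c / t if t else None for c, t in zip(correct, row_totals)]
    producers = [c / t if t else None for c, t in zip(correct, col_totals)]
    return overall, users, producers


# Hypothetical validation sample: 10 points, 3 classes coded 0, 1, 2.
classified = [0, 0, 1, 1, 1, 2, 2, 2, 2, 1]
reference = [0, 1, 1, 1, 2, 2, 2, 2, 0, 0]

matrix = error_matrix(classified, reference, 3)
overall, users, producers = accuracy_measures(matrix)
print(matrix)     # 6 of the 10 samples fall on the diagonal
print(overall, users, producers)
```

Note that none of these outputs carries a coordinate: the same matrix results however the ten points are arranged in space.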
There are two related limitations associated with accuracy assessments and error summaries calculated from the error matrix (McGwire & Fisher, 2001):
1) The error matrix and the accuracy measures it supports provide no information about the spatial distribution of error;
2) The overall accuracy measures derived from the error matrix may be inappropriate for sub-regions, where local error rates may be much larger or smaller than the global measures.
Overcoming such problems is important because many users of land cover data may be interested only in a particular subset of the data, either relating to a specific locale or to specific classes.
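The second limitation can be illustrated with a small, entirely hypothetical sample in which a single global figure hides very different sub-region accuracies. The region labels, class names and function names below are assumptions made for illustration only.

```python
def overall_accuracy(classified, reference):
    """Proportion of samples whose classified label matches the reference label."""
    matches = sum(c == r for c, r in zip(classified, reference))
    return matches / len(classified)


def accuracy_by_region(classified, reference, regions):
    """Overall accuracy computed separately for each sub-region label."""
    grouped = {}
    for c, r, g in zip(classified, reference, regions):
        grouped.setdefault(g, []).append(c == r)
    return {g: sum(flags) / len(flags) for g, flags in grouped.items()}


# Hypothetical sample: 8 validation points split across two sub-regions.
classified = ["urban", "urban", "bare", "bare", "urban", "bare", "urban", "bare"]
reference = ["urban", "urban", "bare", "bare", "bare", "urban", "urban", "urban"]
regions = ["north"] * 4 + ["south"] * 4

global_accuracy = overall_accuracy(classified, reference)
regional_accuracy = accuracy_by_region(classified, reference, regions)
print(global_accuracy)     # 5 of 8 samples agree overall
print(regional_accuracy)   # all 4 agree in "north", only 1 of 4 in "south"
```

A user interested only in the southern sub-region would be poorly served by the global value in this toy case.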
Some work in the remote sensing literature has explored the spatial distribution of different types of error and methods for reporting it. Campbell (1981) compared Landsat multispectral scanner images in the same growing season and found that misclassified pixels tended to be clustered. Congalton (1988) applied a Getis and Ord approach to analyse join count statistics to compare two datasets. Steele et al. (1998) used kriging to provide an optimal interpolation of map error. McGwire and Fisher (2001) recommended the use of Monte Carlo approaches to model the spatial distribution of errors. Some more recent research has examined the variability or non-stationarity of the distribution of errors. Riemann et al. (2010) describe a number of metrics for characterising the accuracy of spatial data that are dependent on reference data properties, and Foody (2005) estimated local accuracy measures by interpolating the outputs of confusion matrices calculated at regularly spaced intervals. Current validation and accuracy techniques in remote sensing have largely ignored the advances supported by such methods.
This research is in the spirit of Foody (2005). It uses Geographically Weighted Regression (GWR), a statistical method that explicitly deals with spatial non-stationarity (Brunsdon et al., 1996, Fotheringham et al., 2002), and a geographically weighted difference measure to analyse the spatial variations in the relationship between reference data and classified data for Boolean and fuzzy classes respectively. Geographically weighted approaches estimate spatially distributed measures of accuracy that are more informative than those provided by the confusion matrix. The paper proceeds as follows. Section 2 describes some of the scientific background to error matrices and their use in Boolean and fuzzy classifications. The methods and GWR are described in Section 3. Section 4 presents the results before a discussion of the issues arising from this research (Section 5) and some conclusions are drawn (Section 6).
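The idea of a spatially distributed accuracy measure can be sketched with a kernel-weighted mean. This is a simplified stand-in, not the paper's GWR model: it corresponds roughly to an intercept-only geographically weighted estimate. The Gaussian kernel, the fixed bandwidth, the coordinates and the membership values are all assumptions chosen for illustration.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian distance-decay kernel: nearby samples receive weights close to 1."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gw_mean(target, points, values, bandwidth):
    """Kernel-weighted mean of `values` observed at `points`, centred on `target`.

    Assumes at least one sample lies close enough to `target` for the
    weights not to underflow to zero.
    """
    weights = [gaussian_weight(math.dist(target, p), bandwidth) for p in points]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical validation points along a transect (arbitrary map units).
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0), (10.0, 0.0)]

# Boolean case: 1 where classified and reference labels agree, else 0.
agreement = [1, 1, 0, 0]
local_west = gw_mean((0.0, 0.0), points, agreement, bandwidth=2.0)
local_east = gw_mean((10.0, 0.0), points, agreement, bandwidth=2.0)

# Fuzzy case (assumed form): absolute difference between classified and
# reference memberships of one class, averaged with the same local weights.
classified_membership = [0.9, 0.8, 0.3, 0.2]
reference_membership = [1.0, 1.0, 1.0, 1.0]
differences = [abs(c - r) for c, r in zip(classified_membership, reference_membership)]
fuzzy_west = gw_mean((0.0, 0.0), points, differences, bandwidth=2.0)
fuzzy_east = gw_mean((10.0, 0.0), points, differences, bandwidth=2.0)

print(local_west, local_east)   # global accuracy is 0.5; the local values diverge
print(fuzzy_west, fuzzy_east)
```

Evaluating `gw_mean` at every cell of a grid would yield a mapped accuracy surface. The bandwidth controls how local the estimate is, and in practice it would need to be chosen with care.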
Background
It is typical for the quality of spatial data such as land cover from remotely sensed imagery to be described using measures of thematic accuracy. The origins of the requirement of at least 85% thematic map accuracy can be traced back to Anderson (1971). Although the scientific basis for this accuracy level has been criticised (Congalton and Green, 1999, Pontius and Millones, 2011), it is historically related to land information being used for taxation assessments (Fisher, 1991, Fisher et al.,
Data and study area
The area of the present study is located in the north western part of Libya in the Jifara Plain, around Tripoli. Satellite imagery from the Système Pour l'Observation de la Terre (SPOT) 5 sensor from 2009 was resampled to 30 m × 30 m as part of a wider study examining land cover changes using Landsat data from 1976, 1989 and 2005. It was classified into six classes: Urban, Woodland, Vegetation, Grazing Land, Bare areas and Water. Water was not a focus for this research. The class descriptions are
Results
The validation data were used to construct a standard error matrix (Table 3). The crisp, Boolean data allow a straightforward comparison between data classified from the remote sensing imagery and the reference data collected in the field, as well as user's and producer's accuracies to be calculated. Whilst it is evident that some classes are more reliably classified than others, the table provides no information about the spatial distribution of either overall error or errors for different
Discussion
The major contributions of this research relate to the development of 1) spatially distributed measures of accuracy using a kernel and distance weighting in geographically weighted accuracy measures; 2) a portmanteau accuracy measure for Boolean land cover data; and 3) a fuzzy difference measure to describe the accuracy of fuzzy classifications. The outputs of the Boolean analysis, using a portmanteau measure of accuracy, indicate the spatial variation in the extent to which the reference
Conclusions
This research uses geographically weighted approaches to describe the spatial variation in the accuracy of Boolean and fuzzy classifications of remotely sensed data. It proposes a portmanteau approach to describe Boolean land cover accuracy and fuzzy difference measures to describe the accuracy of fuzzy land cover. It addresses two long-standing gaps in the analysis and communication of land cover accuracy and error. First, by analysing the spatial distribution of errors it provides a better
Acknowledgements
The authors would like to thank the anonymous reviewers whose meticulous consideration of the earlier drafts and related comments have resulted in a much improved paper.
References (50)
- Geographically weighted summary statistics: A framework for localized exploratory data analysis. Computers Environment and Urban Systems (2002)
- The effect of spatial autocorrelation and class proportion on the accuracy measures from different sampling designs. ISPRS Journal of Photogrammetry and Remote Sensing (2009)
- A review of assessing the accuracy of classifications of remotely sensed data. Remote Sensing of Environment (1991)
- Object-based analysis and change detection of major wetland cover types and their classification uncertainty during the low water period at Poyang Lake, China. Remote Sensing of Environment (2011)
- Remote sensing of land cover classes as type 2 Fuzzy sets. Remote Sensing of Environment (2010)
- Status of land cover classification accuracy assessment. Remote Sensing of Environment (2002)
- Geographical weighting as a further refinement to regression modelling: An example focused on the NDVI–rainfall relationship. Remote Sensing of Environment (2003)
- Forest carbon densities and uncertainties from Lidar, QuickBird, and field measurements in California. Remote Sensing of Environment (2010)
- Modeling moulin distribution on Sermeq Avannarleq glacier using ASTER and WorldView imagery and Fuzzy set theory. Remote Sensing of Environment (2011)
- An effective assessment protocol for continuous geospatial datasets of forest characteristics using USFS Forest Inventory and Analysis (FIA) data. Remote Sensing of Environment (2010)
- While Boolean sets non-gently rip: A theoretical framework on fuzzy sets for mapping landscape patterns. Ecological Complexity
- Estimation and mapping of misclassification probabilities for thematic land cover maps. Remote Sensing of Environment
- Practical implications of design-based sampling inference for thematic map accuracy assessment. Remote Sensing of Environment
- Land use classification schemes used in selected recent geographic applications of remote sensing. Photogrammetric Engineering
- Mapping the ecotone with Fuzzy sets
- Geographically weighted regression: A method for exploring spatial non-stationarity. Geographical Analysis
- Spatial correlation effects upon accuracy of supervised classification of land cover. Photogrammetric Engineering and Remote Sensing
- Introduction to remote sensing
- Using semantics to clarify the conceptual confusion between land cover and land use: The example of ‘forest’. Journal of Land use Science
- Using spatial auto-correlation analysis to explore the errors in maps generated from remotely sensed data. Photogrammetric Engineering and Remote Sensing
- Assessing the accuracy of remotely sensed data: Principles and practices
- The importance of land-cover change in simulating future climates. Science
- Modelling soil map-unit inclusions by Monte Carlo simulation. International Journal of Geographical Information Systems
- The pixel: A snare and a delusion. International Journal of Remote Sensing
- Improved modelling of elevation error with geostatistics. GeoInformatica