Statistics and GIS in environmental geochemistry — some problems and solutions
Introduction
With the rapid development of computer technology, statistics and GIS are receiving increasing attention in environmental geochemistry. Applying these techniques well, however, requires an understanding of their functions and limitations, and of their advantages and disadvantages. Problems of interest to geochemists include probability distribution, mean calculation, spatial structure, correlation, database management, visualization, prediction, decision support, outlier detection, and differentiation of anthropogenic from natural backgrounds. This paper discusses these problems and possible solutions in the application of statistics and GIS to environmental geochemistry. Examples are based on the authors' experiences in the Yangtze River basin, China, and in southeastern Sweden.
During the 1980s, two massive environmental protection projects (`A Study on Background Values in the Yangtze River System', 1986–1990, and `A Study on Background Values in the Dongting Lake System', 1981–1985) were carried out in China, and a large quantity of data was acquired. Univariate, multivariate and spatial statistical analyses of these data have been carried out by the authors (Zhang et al., 1995; Zhang and Selinus, 1997).
The Geological Survey of Sweden (SGU) initiated a national mapping program in 1982, based on three types of geochemical samples (bedrock, till, and biogeochemistry), with the objective of producing a detailed geochemical atlas of the entire country. The program is still ongoing, and all the data are stored in a database at SGU. The authors have investigated problems concerning the calculation of mean values, outlier detection, relationships among elements, spatial distribution features, relationships among the different types of geochemical samples, and differentiation of anthropogenic from natural background (Zhang et al., 1998a, 1998b; Zhang and Selinus, 1998a, 1998b). Because of the large amount of data, a small area in the southeastern part of Sweden, for which a wide variety of information is available, has been chosen for study. The area measures 75 × 75 km (Fig. 1). As the direction of glacier movement during the last ice age was from north-northwest to south-southeast in the region (Lundqvist, 1994), the bedrock to the north and west of the area contributes to the heavy metal concentrations in tills and roots in the area under study. The bedrock map is therefore extended by 25 km to both the north and the west. A simplified overview of the geology of the project area is shown in Fig. 1.
Four major types of bedrock occur in the area: acid volcanic rocks, granitic rocks, basic rocks, and sedimentary rocks. Granites dominate, covering 64.3% of the area. Acid volcanic rocks are the second most common, accounting for 21.5%. Basic rocks cover only 8.5%, and sedimentary rocks (shales, sandstones and limestones) only 3.7%. Lakes occupy the remaining 2.1%. Despite their small extent, the basic rocks have a significant effect on metal distribution, as will be shown, because they have high concentrations of heavy metals (except for Pb) and are easily weathered. Lead concentrations, on the other hand, are elevated in the acid volcanic rocks (Zhang et al., 1998b).
Problems in environmental geochemistry and possible solutions
When dealing with environmental geochemical data, many problems may be encountered, such as probability distribution, mean calculation, spatial structure, correlation, database management, visualization, prediction, decision support, outlier detection, and differentiation of anthropogenic from natural backgrounds. These problems can possibly be solved with the aid of statistics (univariate, multivariate, and spatial statistics), GIS, expert systems and environmental information
Summary
The problems and possible solutions discussed above are summarized in Table 4.
Some significant points may be emphasized. The `robust-symmetric mean' proposed by the authors is one of the best means. The combination of univariate methods and PCA is used to detect outlying samples. PLS, PCA, cluster analysis and expert systems are useful to differentiate anthropogenic from natural anomalies. Spatial correlations among environmental geochemical variables are revealed by the cross-variogram.
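As an illustration of the last point, the following is a minimal sketch of an experimental cross-variogram, using the standard estimator γ_uv(h) = 1/(2N(h)) Σ [u(x_i) − u(x_j)][v(x_i) − v(x_j)] summed over the N(h) sample pairs separated by approximately h. It is not the authors' implementation: the function name `cross_variogram`, its parameters, and the isotropic (direction-independent) distance binning are assumptions made for this example.

```python
import numpy as np


def cross_variogram(coords, u, v, lag_edges):
    """Experimental cross-variogram of two variables (illustrative sketch).

    coords    : (n, 2) array of sample locations (hypothetical input layout)
    u, v      : length-n arrays of two variables measured at the same locations
    lag_edges : increasing distance bin edges; bin k is [edge_k, edge_{k+1})
    Returns (gamma, counts): one estimate and one pair count per lag bin.
    Assumes isotropy, i.e. only the separation distance matters.
    """
    coords = np.asarray(coords, dtype=float)
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)

    # All unordered sample pairs (i < j).
    i, j = np.triu_indices(len(coords), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)

    # Product of the paired increments of the two variables.
    prod = (u[i] - u[j]) * (v[i] - v[j])

    n_bins = len(lag_edges) - 1
    gamma = np.full(n_bins, np.nan)   # NaN marks bins without any pairs
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        in_bin = (dist >= lag_edges[k]) & (dist < lag_edges[k + 1])
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = 0.5 * prod[in_bin].mean()
    return gamma, counts
```

When `u` and `v` are the same variable, the estimator reduces to the ordinary semivariogram. Values that rise with lag distance suggest that the two variables vary together spatially, while negative values suggest opposing spatial trends.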
Acknowledgements
Dr. Chaosheng Zhang thanks the Swedish Institute for providing a scholarship that enabled him to visit the Geological Survey of Sweden. The study is partly supported by the `One-hundred-person Plan' of the Chinese Academy of Sciences and the National Natural Science Foundation of China. The authors would like to express their thanks to Dr. Frank Manheim, Dr. Larry Gough and two anonymous reviewers for their helpful comments, which improved the paper.
References (25)
- et al., 1992. The fractal nature of geochemical landscapes. J. Geochem. Explor.
- et al., 1987. Multivariate modelling of geochemical and geophysical exploration data. Chemometrics Int. Lab. Syst.
- et al., 1987. Multivariate geochemical modelling and integration with petrophysical data. J. Geochem. Explor.
- et al., 1995. Separating anthropogenic from natural anomalies in environmental geochemistry. J. Geochem. Explor.
- et al., 1987. UNSCRAMBLER, an interactive program for multivariate calibration and prediction. In: Kvalheim, O.M. (Ed.), Multivariate Statistical Workshop for Geologists and Geochemists. Chemometrics and Intelligent Laboratory Systems.
- et al., 1997. Spatial analyses for copper, lead and zinc contents in sediments of the Yangtze River basin. Sci. Total Environ.
- et al., 1998. Statistical analyses for heavy metal contents in till and root samples in an area of southeastern Sweden. Sci. Total Environ.
- Clark, I., 1979. Practical Geostatistics. Applied Science Publishers, London, 129...
- Cliff, A.D., Ord, J.K., 1981. Spatial Processes, Models and Applications. Pion, London, 266...
- Esbensen, K., Steenfelt, A., Martens, H., Tyssö, V., 1987b. Prediction of tungsten potential using PLS-regression on...