Hillslope chemical weathering across Paraná, Brazil: A data mining-GIS hybrid approach
Highlights
► We model soil geochemistry. The self-organizing map identifies weathering relations among relief features. ► Cross-component plots of soil geochemistry reveal higher calcium proportions in concave areas with convergent hydrological flux and lower proportions in convex areas with divergent flux. ► The relation among soil geochemistry, elevation, and concave–convex hillslope features reveals that subsurface weathering and transport are important processes.
Introduction
Terrain morphometric features reflect the physical and chemical weathering processes by which they were created (Heimsath et al., 1997). Understanding weathering therefore requires knowledge of the phenomena that influence landscape formation. Early modeling approaches quantified weathering from a physical mass balance viewpoint (Roering et al., 1999). Empirical models to survey multidimensional geochemical data were developed using multivariate statistical methods that included multiple linear regression (Stewart et al., 2003), principal component analysis (Reimann et al., 2002), and cluster analysis (Hanesch et al., 2001). For these models to be reliable, however, the data had to be normally distributed and stationary, with no collinearity among independent (explanatory) variables (Netter et al., 1996). In addition to penalizing higher numbers of explanatory variables (Netter et al., 1996), these techniques lost important nonlinear associations. These assumptions are particularly problematic because, according to Reimann and Filzmoser (1999), geochemical data at the regional scale are neither normally nor lognormally distributed. For these and other reasons, numerical models have been developed and applied as an alternative.
Early numerical models considered the hillslope to be uniform (rectilinear) along its extension with no provision for transport and deposition rate heterogeneity. Investigators improved on this model type by introducing a nonlinear sediment transport rate through morphometry characterized by a convex hilltop, rectilinear middle section, and concave base (Roering et al., 1999). Mudd and Furbish (2004) formulated a model that coupled physical sediment transport to chemical deposition–denudation in the hillslope weathering process. One simplifying assumption in their model was constant elevation over the time period being modeled. Application of this model revealed that the total amount of mass transported by chemical weathering increased nonlinearly with distance from the hillslope ridge, while at the rectilinear inflexion point the mechanical transport began to decrease (Fig. 1a). Yoo et al. (2007) applied a similar model to soil geochemical measurements collected along a sampling traverse in southeastern Australia. Their simulated chemical weathering rates revealed a hillslope mass loss near the divide and an accumulation near the base (Fig. 1b). Along the hillslope, three distinct geochemical environments were recognized based on the concentration of predominant dissolved ions: (1) Si, Al, and Fe at the hillslope top indicated an oxidizing environment with decreased weathering rates towards the base; (2) Ca, Mg, Na, and K at the rectilinear section indicated a neutral pH environment; and (3) P and Ca at the base indicated a reducing environment in which gains in mass were comparable with losses in the upper sections. This finding, together with simulations indicating higher soil moisture content in concave areas than in convex areas, demonstrated the direct link between element mobility and soil physical–chemical conditions, such as moisture, pH, temperature, and porosity.
Some challenges in the construction and application of numerical hillslope models are their one-dimensionality, steady-state requirements, lack of calibration data, and nonuniqueness. Numerical models are also commonly too rigid to detect unexpected features such as the onset of trends, nonlinear relations, or patterns restricted to subsamples of a data set. These shortcomings created the need for an alternative modeling approach capable of using available data. One technique that is well suited to noisy, sparse, nonlinear, multidimensional, and scale-dependent data is a type of unsupervised artificial neural network called the self-organizing map (Kohonen, 2001). The self-organizing map (SOM) technique has been used in related studies to explore relations among rock geochemistry and hyperspectral images (Penn, 2005), classify geomorphometric aspect based on digital elevation models (Ehsani and Quiel, 2008), characterize hillslope landslide vulnerability (Hentati et al., 2010), identify processes controlling the distribution of iron in soil and sediment (Löhr et al., 2010), and investigate the geochemistry in shallow groundwater. The aim of this study is to understand scale-dependent relations among soil geochemical weathering and morphometric features across the state of Paraná in southern Brazil. The hypothesis is that a conceptual hillslope weathering model can be devised based on the statistical relations between field data and metrics of GIS (geographical information systems).
To achieve this aim and test the hypothesis, the following objectives are undertaken: (1) analyze nonlinear relations among published B-horizon soil geochemical, environmental, relief morphometry, and GIS data from 304 locations using the SOM (Kohonen, 2001) and component planes visualization (Penn, 2005) techniques; (2) identify conceptual models of soil geochemical weathering processes based on k-means clustering (Vesanto and Alhoniemi, 2000) of the SOM topography for future development of predictive (empirical and numerical) models; and (3) evaluate bias and uncertainty in the quantized vector predictions and soil classifications using a stochastic cross validation technique (Rao et al., 2008).
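The SOM step named in objective (1) can be illustrated with a minimal sketch. It assumes a small rectangular map, a Gaussian neighbourhood, and linearly decaying learning rate and radius. These choices, the function names `train_som` and `best_matching_unit`, and the synthetic two-variable data are illustrative assumptions, not the configuration used in the study.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=4, cols=4, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighbourhood function."""
    rng = random.Random(seed)
    sigma0 = max(rows, cols) / 2.0
    # Initialize each node's weight vector as a copy of a randomly chosen sample.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)  # learning rate decays linearly from lr0 toward zero
        sigma = sigma0 * (1.0 - frac) + 0.5  # neighbourhood radius shrinks toward 0.5
        for x in rng.sample(data, len(data)):  # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the node toward the sample, scaled by its map distance to the BMU.
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Synthetic, already-standardized two-variable data (illustration only).
data_rng = random.Random(1)
cluster_a = [[data_rng.gauss(-1.0, 0.1), data_rng.gauss(-1.0, 0.1)] for _ in range(20)]
cluster_b = [[data_rng.gauss(1.0, 0.1), data_rng.gauss(1.0, 0.1)] for _ in range(20)]
data = cluster_a + cluster_b
som = train_som(data)
node_a = best_matching_unit(som, [-1.0, -1.0])
node_b = best_matching_unit(som, [1.0, 1.0])
```

In this sketch, samples from the two synthetic groups map to different nodes. Component planes would correspond to plotting one weight component at a time over the node grid, and a clustering step such as k-means could then be run on the trained weight vectors.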
Study area
Paraná is a state of Brazil, located in the South of the country. According to the Instituto Brasileiro de Geografia e Estatística–IBGE, the state covers about 199,314 km² and is home to about 10 million people living in 399 cities. Its gross domestic product ranks 5th in Brazil, producing about 6.2% of the national wealth. The predominant climate is characterized as subtropical with warm summers and cold winters. According to the Köppen classification, the climate has three variants: Cfa, Cfb
Methods
Five steps were used to identify hillslope weathering relations linking the soil geochemistry to relief morphometric features. First, all data variables were standardized so that no one variable would dominate in the nonlinear modeling process (Kalteh et al., 2008). The z-score transformation is given by: $z_i = (x_i - \bar{x}_i) / s_i$, where $z_i$ is the standardized value, $x_i$ is the raw score, $\bar{x}_i$ is the sample average, $s_i$ is the sample standard deviation, and $i$ is an index for each variable. Standardizing variables in
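As a minimal sketch of the standardization step, the following applies the z-score transformation to each variable independently, using the sample standard deviation. The function name `zscore_columns` and the numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean, stdev


def zscore_columns(table):
    """Standardize each variable to zero mean and unit sample standard deviation."""
    standardized = {}
    for name, values in table.items():
        x_bar = mean(values)
        s = stdev(values)  # sample standard deviation (n - 1 denominator)
        if s == 0:
            raise ValueError(f"variable {name!r} is constant and cannot be standardized")
        standardized[name] = [(x - x_bar) / s for x in values]
    return standardized


# Hypothetical values, for illustration only (not data from the study).
soil = {
    "Ca": [2.0, 4.0, 6.0, 8.0],
    "elevation_m": [300.0, 500.0, 700.0, 900.0],
}
z = zscore_columns(soil)
```

After the transformation, the two variables share the same scale even though their raw units differ by two orders of magnitude, which is the property that keeps any single variable from dominating distance calculations.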
Cross validation
The model performance was evaluated using a stochastic cross validation approach (Rao et al., 2008). The approach consisted of the following steps: leave out one sample, recreate a new SOM, estimate values, and analyze residuals. This process was applied to each variable 30 times. For each variable, the average prediction value over the 30 realizations was computed and plotted against observed values to assess model bias (Fig. 5). Aside from one outlier in the Ca and Al predictions (confidence interval of
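The leave-one-out loop described above can be sketched as follows. To keep the example short, the step "recreate a new SOM and estimate values" is replaced by a simple stand-in: a bootstrap resample of the training set followed by a nearest-neighbour lookup. That stand-in, the function names, and the sample values are assumptions made for illustration and do not reproduce the estimator used in the study.

```python
import random
from statistics import mean


def nearest_neighbour_estimate(training, known, target, rng):
    """Stand-in estimator (NOT the study's SOM): bootstrap the training set,
    then copy the target value of the nearest resampled neighbour."""
    resample = [rng.choice(training) for _ in training]
    nearest = min(
        resample, key=lambda s: sum((s[k] - known[k]) ** 2 for k in known)
    )
    return nearest[target]


def stochastic_loo_cv(samples, target, estimate, realizations=30, seed=0):
    """Leave each sample out in turn and average its estimate over realizations."""
    rng = random.Random(seed)
    observed, predicted = [], []
    for i, held_out in enumerate(samples):
        training = samples[:i] + samples[i + 1:]  # leave one sample out
        known = {k: v for k, v in held_out.items() if k != target}
        # Rebuild the model and estimate the hidden value once per realization.
        estimates = [
            estimate(training, known, target, rng) for _ in range(realizations)
        ]
        observed.append(held_out[target])
        predicted.append(mean(estimates))
    residuals = [p - o for p, o in zip(predicted, observed)]
    return observed, predicted, residuals


# Hypothetical samples, for illustration only (not data from the study).
samples = [{"elevation": float(i), "Ca": 2.0 * i} for i in range(10)]
observed, predicted, residuals = stochastic_loo_cv(
    samples, "Ca", nearest_neighbour_estimate
)
bias = mean(residuals)  # average residual as a simple bias indicator
```

Plotting `predicted` against `observed` gives the kind of bias assessment the text describes. Passing the estimator as an argument means a SOM-based estimator could be substituted without changing the loop.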
Conclusions
This study found that it is possible to use data mining techniques for the evaluation of multi-scale hillslope chemical weathering processes. Using a type of unsupervised artificial neural network, called the self-organizing map (SOM), multidimensional soil geochemical and geophysical variables can be projected onto a two-dimensional surface while preserving important nonlinear relations. Grouping nonlinear relations using the k-means clustering technique facilitates the development of
Acknowledgements
We are grateful to Coordenação de Aperfeiçoamento de Pessoal de Nível Superior and Conselho Nacional de Desenvolvimento Científico e Tecnológico for their financial support; to Victor F. Labson, Director, Crustal Geophysics and Geochemistry Science Center (CGGSC), U.S. Geological Survey (USGS), Denver, Colorado, for providing the first author with the position of visiting scientist; to the Paraná Agronomic Institute represented by Mario Miyazawa, who provided us complementary information about
References (40)
et al. Comparison of self-organizing maps classification approach with cluster and principal components analysis for large environmental datasets. Water Res. (2007).
et al. The age curves of sulfur and oxygen isotopes in marine sulfur and their mutual interpretation. Chem. Geology (1980).
et al. Geomorphometric feature analysis using morphometric parametrization and artificial neural networks. Geomorphology (2008).
et al. The application of fuzzy C-means cluster analysis and non-linear mapping to a soil data set for the detection of polluted sites. Phys. Chem. Earth (A) (2001).
et al. Evaluation of sedimentation vulnerability at small hillslide reservoir in the semi-arid region of Tunisia using Self-Organizing Map. Geomorphology (2010).
et al. Iron occurrence in soils and sediments of coastal catchment: a multivariate approach using self-organizing maps. Geoderma (2010).
et al. Soil-landscape delineation to define spatial sampling domains for hillslope hydrology. J. Hydrol. (2004).
et al. Factor analysis applied to regional geochemical data: problems and possibilities. Appl. Geochem. (2002).
et al. Modeling small watersheds in Brazilian Amazonia with shuttle radar topographic mission-90 m data. Comput. Geosci. (2006).
Artificial Neural Networks in Hydrology. I: preliminary concepts. J. Hydrologic Eng. (2000).
Soil Geomorphology.
Metalogênese do Brasil.
Sistema Brasileiro de Classificação de Solos.
Shuttle radar topography mission produces a wealth of data. Am. Geophys. Union EOS.
O Grupo Caiuá (Ks): revisão estratigráfica e contexto deposicional. Revista Brasileira de Geociências.
Soil geomorphology: an integration of pedology and geomorphology.
The soil production function and the landscape equilibrium. Nature.
Reassessment of Polynov's ion mobility series. Soil Sci. Soc. Am. J.
Banco de dados por Estados.
Extracting topographic structure from digital elevation data for geographic information system analysis. Photogrammetric Eng. and Remote Sensing.