Support vector machine: A tool for mapping mineral prospectivity
Introduction
Mapping of mineral prospectivity is crucial in mineral resources exploration and mining. It involves integration of information from diverse geoscience datasets including geological data (e.g., geological map), geochemical data (e.g., stream sediment geochemical data), geophysical data (e.g., magnetic data) and remote sensing data (e.g., multispectral satellite data). Such data can be visualized, processed and analyzed with the support of computer and GIS techniques. Geocomputational techniques for mapping mineral prospectivity include weights of evidence (WofE) (Bonham-Carter et al., 1989), fuzzy WofE (Cheng and Agterberg, 1999), logistic regression (Agterberg and Bonham-Carter, 1999), fuzzy logic (FL) (Ping et al., 1991), evidential belief functions (EBF) (An et al., 1992, Carranza and Hale, 2003, Carranza et al., 2005), neural networks (NN) (Singer and Kouda, 1996, Porwal et al., 2003, Porwal et al., 2004), a ‘wildcat’ method (Carranza, 2008, Carranza, 2010, Carranza and Hale, 2002) and a hybrid method (e.g., Porwal et al., 2006, Zuo et al., 2009). These techniques have been developed to quantify indices of mineral deposit occurrence by integrating multiple evidence layers. Some geocomputational techniques can be performed using popular software packages, such as ArcWofE (a free ArcView extension) (Kemp et al., 1999), ArcSDM 9.3 (a free ArcGIS 9.3 extension) (Sawatzky et al., 2009), MI-SDM 2.50 (a MapInfo extension) (Avantra Geosystems, 2006) and GeoDAS (developed based on MapObjects, which is an Environmental Systems Research Institute (ESRI) development kit) (Cheng, 2000). Other geocomputational techniques (e.g., FL and NN) can be performed by using R and Matlab.
Geocomputational techniques for mineral prospectivity mapping can be categorized generally into two types, knowledge-driven and data-driven, according to the type of inference mechanism considered (Bonham-Carter, 1994, Pan and Harris, 2000, Carranza, 2008). Knowledge-driven techniques, such as those that apply FL and EBF, are based on expert knowledge and experience about spatial associations between mineral prospectivity criteria and mineral deposits of the type sought. On the other hand, data-driven techniques, such as WofE and NN, are based on the quantification of spatial associations between mineral prospectivity criteria and known occurrences of mineral deposits of the type sought. In addition, hybrids of knowledge-driven and data-driven methods are also used for mapping of mineral prospectivity (e.g., Porwal et al., 2006, Zuo et al., 2009). Every geocomputational technique has advantages and disadvantages, and one or the other may be more appropriate for a given geologic environment and exploration scenario (Harris et al., 2001). For example, one of the advantages of WofE is its simplicity and the straightforward interpretation of the weights (Pan and Harris, 2000), but because the model assumes conditional independence, it ignores possible correlations among input predictor patterns, which generally leads to biased prospectivity maps (Porwal et al., 2010). Comparisons between WofE and NN, between NN and logistic regression (LR), and among WofE, NN and LR for mineral prospectivity mapping can be found in Singer and Kouda (1999), Harris and Pan (1999) and Harris et al. (2003), respectively.
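To make the interpretation of WofE weights concrete, the following is a minimal sketch of the standard positive and negative weights for a single binary evidence pattern. The function name and all cell counts are hypothetical, chosen only for illustration; they are not taken from the study.

```python
import math

def wofe_weights(n_cells, n_deposits, n_pattern, n_pattern_deposits):
    """Return (W+, W-, contrast) for one binary evidence pattern B.

    W+ = ln[P(B | D) / P(B | not D)]
    W- = ln[P(not B | D) / P(not B | not D)]
    Inputs are unit-cell counts; every conditional probability must be
    strictly between 0 and 1 for the logarithms to be defined.
    """
    n_non_deposits = n_cells - n_deposits
    p_b_given_d = n_pattern_deposits / n_deposits
    p_b_given_nd = (n_pattern - n_pattern_deposits) / n_non_deposits
    w_plus = math.log(p_b_given_d / p_b_given_nd)
    w_minus = math.log((1 - p_b_given_d) / (1 - p_b_given_nd))
    return w_plus, w_minus, w_plus - w_minus

# Hypothetical counts: 1000 cells, 20 deposits,
# pattern present in 200 cells, 12 of which contain deposits.
w_plus, w_minus, contrast = wofe_weights(1000, 20, 200, 12)
```

A positive W+ together with a negative W- indicates a positive spatial association between the pattern and the deposits. Adding such weights across several patterns is only valid if the patterns are conditionally independent, which is the limitation noted above.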
Mapping of mineral prospectivity is a classification process, because its product (i.e., index of mineral deposit occurrence) for every location is classified as either prospective or non-prospective according to certain combinations of weighted mineral prospectivity criteria. There are two types of classification techniques. One type is known as supervised classification, which classifies mineral prospectivity of every location based on a training set of locations of known deposits and non-deposits and a set of evidential data layers. The other type is known as unsupervised classification, which classifies mineral prospectivity of every location based solely on feature statistics of individual evidential data layers.
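As a rough illustration of what a supervised training set looks like in this setting, the sketch below turns a stack of evidential layers into feature vectors with labels. The layer names, grid values and cell locations are all invented for the example, and the +1/−1 labeling is an assumed convention.

```python
# Hypothetical 3x3 evidential layers (all values invented for illustration).
layers = {
    "dist_to_anticline_km": [[0.2, 1.5, 3.0], [0.4, 2.2, 4.1], [0.1, 0.9, 2.7]],
    "as_ppm": [[35, 12, 4], [28, 9, 3], [40, 15, 6]],
}
deposit_cells = {(0, 0), (2, 0)}      # known deposits      -> label +1
non_deposit_cells = {(0, 2), (1, 2)}  # chosen non-deposits -> label -1

def feature_vector(row, col):
    """One value per layer for a cell, with layers taken in sorted-name order."""
    return [layers[name][row][col] for name in sorted(layers)]

def build_training_set():
    """Collect feature vectors and labels for the labeled cells only."""
    X, y = [], []
    for cells, label in ((deposit_cells, 1), (non_deposit_cells, -1)):
        for row, col in sorted(cells):
            X.append(feature_vector(row, col))
            y.append(label)
    return X, y

X_train, y_train = build_training_set()

# Every remaining cell is unlabeled; a trained classifier would score these.
unlabeled = [(r, c) for r in range(3) for c in range(3)
             if (r, c) not in deposit_cells | non_deposit_cells]
```

An unsupervised method would skip the labels entirely and work only from the statistics of the layer values.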
A support vector machine (SVM) is a class of algorithms for supervised classification (Vapnik, 1995). Certain types of SVMs have been developed and applied successfully to text categorization, handwriting recognition, gene-function prediction, remote sensing classification and other studies (e.g., Joachims, 1998, Huang et al., 2002, Cristianini and Scholkopf, 2002, Guo et al., 2005, Kavzoglu and Colkesen, 2009). An SVM performs classification by constructing an n-dimensional hyperplane in feature space that optimally separates evidential data of a predictor variable into two categories. In the parlance of SVM literature, a predictor variable is called an attribute whereas a transformed attribute that is used to define the hyperplane is called a feature. The task of choosing the most suitable representation of the target variable (e.g., mineral prospectivity) is known as feature selection. A set of features that describes one case (i.e., a row of predictor values) is called a feature vector. The feature vectors nearest the hyperplane are the support vectors. The goal of SVM modeling is to find the optimal hyperplane that separates clusters of feature vectors in such a way that feature vectors representing one category of the target variable (e.g., prospective) are on one side of the plane and feature vectors representing the other category of the target variable (e.g., non-prospective) are on the other side of the plane. A good separation is achieved by the hyperplane that has the largest distance to the neighboring data points of both categories, since in general the larger the margin the lower the generalization error of the classifier. In this paper, SVM is demonstrated as an alternative tool for integrating multiple evidential variables to map mineral prospectivity.
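The geometry described above can be sketched in a few lines. The example assumes a linear hyperplane w·x + b = 0 that has already been fitted and scaled so that the closest feature vectors satisfy |w·x + b| = 1; the values of w, b and the toy feature vectors are hypothetical.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted category."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """+1 (e.g., prospective) on one side of the hyperplane, -1 on the other."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def support_vectors(w, b, X, tol=1e-9):
    """Feature vectors that lie on the margin, i.e. |w.x + b| equals 1."""
    return [x for x in X if abs(abs(decision_value(w, b, x)) - 1.0) <= tol]

# Hypothetical fitted hyperplane and toy 2-D feature vectors.
w, b = [1.0, 1.0], -3.0
X = [[1.0, 1.0], [0.0, 0.5], [2.0, 2.0], [3.0, 2.5]]
labels = [classify(w, b, x) for x in X]
margin = margin_width(w)
on_margin = support_vectors(w, b, X)
```

Here only two of the four feature vectors lie on the margin. Moving either of the other two, as long as it stays outside the margin, would leave the hyperplane unchanged, which is why the support vectors alone determine the classifier.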
Support vector machine algorithms
Support vector machines are supervised learning algorithms, which are considered as heuristic algorithms, based on statistical learning theory (Vapnik, 1995). The classical task of an SVM is binary (two-class) classification. Suppose we have a training set composed of l feature vectors x_i ∈ R^n, where i (= 1, 2, …, l) indexes the feature vectors in the training samples. The class to which each sample belongs is labeled y_i, which is equal to 1 for one class or is equal to −1 for the
Experiments with kernel functions
For spatial geocomputational analysis of mineral exploration targets, the decision function in Eq. (3) is a kernel function. The choice of a kernel function (K) and its parameters for an SVM are crucial for obtaining good results. The kernel function can be used to construct a non-linear decision boundary and to avoid expensive calculation of dot products in high-dimensional feature space. The four popular kernel functions are as follows:
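The list itself is not reproduced in this excerpt. As a hedged illustration only, the sketch below implements four kernels that are commonly used with SVMs (linear, polynomial, radial basis function and sigmoid), together with the usual kernel form of the SVM decision function. Whether these are exactly the four kernels the authors list is an assumption, and the parameter names (gamma, coef0, degree) and default values are illustrative rather than taken from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(u, v, gamma=0.1, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)

def kernel_decision(x, support_vectors, alphas, labels, b, kernel):
    """Sign of sum_i alpha_i * y_i * K(x_i, x) + b for an already-trained SVM."""
    score = sum(a * y * kernel(sv, x)
                for a, y, sv in zip(alphas, labels, support_vectors)) + b
    return 1 if score >= 0 else -1

# Hypothetical trained model: two support vectors with unit multipliers.
svs, alphas, ys, b = [[1.0, 1.0], [2.0, 2.0]], [1.0, 1.0], [-1, 1], -3.0
pred_far = kernel_decision([3.0, 2.5], svs, alphas, ys, b, linear_kernel)
pred_near = kernel_decision([0.0, 0.5], svs, alphas, ys, b, linear_kernel)
```

Because the decision function touches the data only through K, swapping `linear_kernel` for one of the other kernels changes the shape of the decision boundary without any explicit mapping into the high-dimensional feature space.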
Prospectivity mapping in the study area
The study area is located in the western Meguma Terrane of Nova Scotia, Canada. It measures about 7780 km². The host rock of Au deposits in this area consists of Cambro-Ordovician low- to middle-grade metamorphosed sedimentary rocks and a suite of Devonian aluminous granitoid intrusions (Sangster, 1990, Ryan and Ramsay, 1997). The metamorphosed sedimentary strata of the Meguma Group are the lower sand-dominated flysch Goldenville Formation and the upper shaly flysch Halifax Formation occurring in the
Conclusions
Nowadays, SVMs have become a popular geocomputational tool for spatial analysis. In this paper, we used an SVM algorithm to integrate multiple variables for mineral prospectivity mapping. The results obtained by two SVM applications demonstrate that prospective target areas for Au deposits are defined mainly by proximity to NE–SW trending anticlines and to contact zones between the Goldenville and Halifax Formations. In the study area, the SVM classifications of mineral prospectivity have 5–9%
Acknowledgements
The first author is thankful to Prof. Qiuming Cheng for the datasets used in this study. We would like to thank Prof. Frits Agterberg, and two anonymous reviewers for constructive comments and suggestions. This research benefited from the joint financial support from the Most Special Fund of the State Key Laboratory of Geological Processes and Mineral Resources, China University of Geosciences (No.MSFGPMR200912), the Fundamental Research Funds for the Central Universities (No.CUGL090212), and
References (44)
- et al. Evidential belief functions for geologically constrained mapping of gold potential, Baguio district, Philippines. Ore Geology Reviews (2003)
- et al. Selection of coherent deposit-type locations and their application in data-driven mineral prospectivity mapping. Ore Geology Reviews (2008)
- et al. Support vector machines for prediction distribution of Sudden Oak Death in California. Ecological Modelling (2005)
- et al. A kernel functions analysis for support vector machines for land cover classification. International Journal of Applied Earth Observation and Geoinformation (2009)
- et al. Support vector domain description. Pattern Recognition Letters (1999)
- et al. Application of a hybrid method combining multilevel fuzzy comprehensive evaluation with asymmetric fuzzy relation analysis to mapping prospectivity. Ore Geology Reviews (2009)
- Computer programs for mineral exploration. Science (1989)
- Agterberg, F.P., Bonham-Carter, G.F., 1999. Logistic regression and weights of evidence modeling in mineral...
- An, P., Moon, W.M., Bonham-Carter, G.F., 1992. On knowledge-based approach on integrating remote sensing, geophysical...
- Avantra Geosystems, 2006. A Tutorial Guide to using MI-SDM v2.50 based on USGS Open-File Report 01-221 by Gary L....