Bayesian network classifiers for mineral potential mapping

https://doi.org/10.1016/j.cageo.2005.03.018

Abstract

In this paper, we describe three Bayesian classifiers for mineral potential mapping: (a) a naive Bayesian classifier that assumes complete conditional independence of input predictor patterns, (b) an augmented naive Bayesian classifier that recognizes and accounts for conditional dependencies amongst input predictor patterns and (c) a selective naive classifier that uses only conditionally independent predictor patterns. We also describe methods for training the classifiers, which involve determining the dependencies amongst predictor patterns and estimating the conditional probability of each predictor pattern given the target deposit-type. The output of a trained classifier determines the extent to which an input feature vector belongs to either the mineralized class or the barren class and can be mapped to generate a favorability map. The procedures are demonstrated by an application to base metal potential mapping in the Proterozoic Aravalli Province (western India). The results indicate that although the naive Bayesian classifier performs well and shows significant tolerance for violation of the conditional independence assumption, the augmented naive Bayesian classifier performs better and exhibits superior generalization capability. The results also indicate that the rejection of conditionally dependent predictor patterns degrades the performance of a naive classifier.

Introduction

A Bayesian network is an annotated directed acyclic graph (DAG) that models uncertain relationships amongst variables in a complex system (Fig. 1). Fundamental to Bayesian networks is the idea of modularity, i.e., a complex system can be decomposed into several consistent modules, which are represented by Markov blankets of the variables. The Markov blanket of a variable comprises its parent variables, its child variables and parents of its child variables (Pearl, 1988). Parent and child variables are identified on the basis of mutual dependencies—a child variable is conditionally dependent on a set of parent variables. The Markov blankets of all variables can be connected to obtain a comprehensive representation of the whole complex system.
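To make the Markov blanket concrete, the following minimal sketch computes it for a small hypothetical DAG; the graph, variable names and code are illustrative only and are not taken from this study.

```python
# Minimal sketch: the Markov blanket of a node in a DAG (Pearl, 1988).
# The DAG is encoded as a mapping from each node to the set of its parents.
# The example graph and variable names are hypothetical.

def markov_blanket(node, parents):
    """Return the parents of `node`, its children, and the other parents
    of its children (i.e., its Markov blanket)."""
    children = {v for v, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]} - {node}
    return set(parents.get(node, set())) | children | co_parents

# Hypothetical DAG: Deposit -> Lithology, Deposit -> Fault, Age -> Lithology
parents = {
    "Deposit": set(),
    "Age": set(),
    "Lithology": {"Deposit", "Age"},
    "Fault": {"Deposit"},
}

print(markov_blanket("Deposit", parents))  # {'Age', 'Lithology', 'Fault'}
```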

A Bayesian classifier is a special Bayesian network in which one, and only one, variable represents a class variable and all other variables are considered attributes characterizing the class variable. The class variable is at the root of the network, i.e., it has no parent variables, while each attribute has at least the class variable as a parent (depending upon the structure of a Bayesian classifier, it is possible for an attribute to have other attributes as parents; see further below). A class variable can have two or more states, each state representing a discrete class label. Similarly, attributes can also be binary or multi-state. The task of a Bayesian classifier is to map an input feature vector, comprising particular instances of the attributes, to a specific class label. For this, the classifier is trained on a set of pre-classified feature vectors, which results in the induction of the conditional probabilities of all attributes given the class variable. The trained classifier applies Bayes' rule to compute the posterior probabilities of all states of the class variable given the particular instances of the attributes in the feature vector and predicts the class label that receives the highest posterior probability. Bayesian classifiers can be used efficiently in mineral potential mapping of an area, because mineral potential mapping involves the predictive classification of each spatial unit having a unique combination of predictor patterns (unique conditions; Bonham-Carter and Agterberg, 1990) as either mineralized or barren.
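In the simplest configuration, in which the class variable is the only parent of every attribute (the naive classifier introduced below), the posterior computation reduces to a product of class-conditional probabilities. The following minimal sketch illustrates this classification step; the predictor patterns and all probability values are hypothetical and are not those estimated in this study.

```python
# Minimal sketch: classifying one feature vector with Bayes' rule under
# the naive (complete conditional independence) assumption.
# Priors and conditional probabilities below are hypothetical.

priors = {"mineralized": 0.05, "barren": 0.95}

# P(pattern present | class) for two hypothetical binary predictor patterns
cond_prob = {
    "mineralized": {"mafic_volcanics": 0.8, "fault_proximity": 0.7},
    "barren":      {"mafic_volcanics": 0.3, "fault_proximity": 0.4},
}

def posterior(feature_vector):
    """Return P(class | feature vector), assuming conditional independence."""
    scores = {}
    for cls, prior in priors.items():
        p = prior
        for pattern, present in feature_vector.items():
            p_pattern = cond_prob[cls][pattern]
            p *= p_pattern if present else (1.0 - p_pattern)
        scores[cls] = p
    total = sum(scores.values())
    return {cls: p / total for cls, p in scores.items()}

# A unique-conditions unit in which both predictor patterns are present
print(posterior({"mafic_volcanics": True, "fault_proximity": True}))
```

The class label that receives the higher posterior probability is the predicted class of the spatial unit.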

One simple Bayesian classifier that can be used in mineral potential mapping is the naive Bayesian classifier (Duda and Hart, 1973; Langley et al., 1992). However, this classifier assumes complete conditional independence amongst the attributes, which is unrealistic for many predictor patterns used in mineral potential mapping. Several Bayesian classifiers unrestricted by the conditional independence assumption are described in the literature, for example, the semi-naive Bayesian classifier (Kononenko, 1991), the Bayesian multinet classifier (Heckerman, 1990; Geiger and Heckerman, 1996), the tree-augmented naive Bayesian classifier (Friedman et al., 1997) and the augmented naive Bayesian classifier (Friedman et al., 1997). The selective naive Bayesian classifier (Langley and Sage, 1994), in contrast, makes classifications based only on conditionally independent variables.

In this paper, we describe and apply the following Bayesian classifiers to regional-scale base metal potential mapping in a part of the Aravalli Province, western India: (a) a naive classifier that assumes conditional independence of predictor patterns, (b) an augmented naive classifier that recognizes and accounts for conditional dependencies amongst predictor patterns and (c) a selective naive classifier that uses only the conditionally independent predictor patterns.
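One standard measure for recognizing such dependencies from training data is the class-conditional mutual information between pairs of attributes, used in the tree-augmented construction of Friedman et al. (1997). The sketch below illustrates that measure on hypothetical samples; it is a generic illustration, not a reproduction of the dependency analysis used in this paper.

```python
# Minimal sketch: class-conditional mutual information I(X; Y | C) between
# two binary predictor patterns X and Y given the class C
# (Friedman et al., 1997). The training samples below are hypothetical.
import math
from collections import Counter

def conditional_mutual_information(samples):
    """samples: list of (x, y, c) tuples; returns I(X; Y | C) in nats."""
    n = len(samples)
    n_xyc = Counter(samples)
    n_c = Counter(c for _, _, c in samples)
    n_xc = Counter((x, c) for x, _, c in samples)
    n_yc = Counter((y, c) for _, y, c in samples)

    cmi = 0.0
    for (x, y, c), count in n_xyc.items():
        # I(X;Y|C) = sum over x,y,c of P(x,y,c) * log[P(x,y|c) / (P(x|c) P(y|c))]
        cmi += (count / n) * math.log(
            (count * n_c[c]) / (n_xc[(x, c)] * n_yc[(y, c)])
        )
    return cmi

# Hypothetical training samples: (pattern X present?, pattern Y present?, class)
samples = [(1, 1, "m"), (1, 1, "m"), (0, 0, "m"), (1, 0, "b"),
           (0, 0, "b"), (0, 1, "b"), (0, 0, "b"), (1, 1, "b")]
print(conditional_mutual_information(samples))  # larger values suggest dependence given C
```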

The Prospector expert system developed at Stanford Research Institute used a series of Bayesian inference networks for evaluating mineral deposits (Duda et al., 1978). Although the original Prospector did not allow the use of spatial data, later versions were modified to support spatial data. For example, Duda et al. (1978) used the Prospector for combining predictor patterns in a study of the Island Copper deposit, British Columbia. Information was propagated in the networks by (a) Bayesian updating of prior to posterior probabilities and (b) applying fuzzy Boolean operators. Likelihood ratios and prior probabilities were estimated on the basis of expert knowledge. Campbell et al. (1982) used the Prospector for mapping the potential of molybdenum deposits in the Mt. Tolman area of Washington State. Katz (1991) implemented the Prospector in a GIS environment. Reddy et al. (1992) used similar methods to map base metal potential in Manitoba. Yatabe and Fabbri (1988) and Bonham-Carter (1994) provide reviews of the Prospector.

The weights of evidence approach (Agterberg et al., 1990, Bonham-Carter and Agterberg, 1990) also applies the Bayesian concept of updating of prior to posterior probabilities, but uses exploration data to estimate likelihood ratios and prior probabilities. The posterior probabilities are estimated using Bayes’ equation in a log-linear form, under the assumption of conditional independence of input predictor patterns. Owing to its easy applicability and intuitive approach, it is possibly the most widely used technique in mineral potential mapping.
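In its usual notation (after Bonham-Carter and Agterberg, 1990), the log-linear form of Bayes' equation referred to above can be written as

\[
\ln O(D \mid B_1, \ldots, B_n) \;=\; \ln O(D) \;+\; \sum_{j=1}^{n} W_j^{\pm},
\qquad
W_j^{+} = \ln \frac{P(B_j \mid D)}{P(B_j \mid \bar{D})},
\qquad
W_j^{-} = \ln \frac{P(\bar{B}_j \mid D)}{P(\bar{B}_j \mid \bar{D})},
\]

where $O$ denotes odds, $D$ the occurrence of a deposit and $B_j$ the $j$th binary predictor pattern; $W_j^{+}$ or $W_j^{-}$ is added according to whether $B_j$ is present or absent in the spatial unit. The additivity of the weights rests on the conditional independence assumption.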


Bayesian classifiers

Given a finite set $U = \{X_1, \ldots, X_n\}$ of discrete random variables, a Bayesian network $B$ on $U$ is defined as the pair $\langle G, \Theta \rangle$, where $G$ is a DAG and $\Theta$ is a set of parameters that quantifies the network (Friedman et al., 1997). The DAG $G$ encodes the following assumption: each variable $X_i$ is independent of its non-descendants, given its parents in $G$. The set $\Theta$ contains a parameter $\theta_{x_i \mid \Pi_{x_i}}$ for each possible value $x_i$ of $X_i$ and each configuration $\Pi_{x_i}$ of $\Pi_{X_i}$, where $\Pi_{X_i}$ denotes the set of parents of $X_i$ in $G$. The Bayesian network $B$
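For context, the joint probability distribution that a Bayesian network $B$ defines over $U$ is, in the notation of Friedman et al. (1997),

\[
P_B(X_1, \ldots, X_n) \;=\; \prod_{i=1}^{n} P(X_i \mid \Pi_{X_i}) \;=\; \prod_{i=1}^{n} \theta_{X_i \mid \Pi_{X_i}}.
\]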

Application to base metal deposit potential mapping in Aravalli Province, western India

The study area forms a part of the Aravalli metallogenic province in the state of Rajasthan, western India (Fig. 2). Its area is about 34,000 km² and it is located between latitudes 23°30′N and 26°N and longitudes 73°E and 75°E.

The province is characterized by two fold belts, viz., the Palaeo-Mesoproterozoic Aravalli Fold Belt and the Meso-Neoproterozoic Delhi Fold Belt, which are ingrained in a reworked basement complex that contains incontrovertible Archaean components (Heron, 1953, Roy, 1988,

Discussion

The formation and localization of mineral deposits are the end-results of a complex interplay of several metallogenetic processes that exhibit signatures in the form of geologic features associated with the mineral deposits. These geological features, called recognition criteria, are characterized by their responses in one or more geodata sets that are used as predictors in mineral potential mapping. It is unrealistic to assume independence of the predictors because (a) a particular geologic

Conclusions

1. A naive classifier provides an efficient tool for mineral potential mapping. It is easy to construct, train and implement. Although it is based on the strong assumption of conditional independence of input predictor patterns, it shows significant tolerance for violations of that assumption.

2. The performance of a naive classifier is significantly improved if the conditional independence assumption is relaxed by recognizing and accounting for dependencies amongst the predictor patterns in an

Acknowledgements

The authors thank Dr. G.F. Bonham-Carter and an anonymous referee for their comments, which helped in improving the manuscript.

References (43)

  • Chickering, D.M., Geiger, D., Heckerman, D., 1994. Learning Bayesian networks is NP-hard. Technical Report...
  • Cooper, G.F., et al., 1992. A Bayesian method for the induction of probabilistic networks from data. Machine Learning.
  • Deb, M. VMS deposits: geological characteristics, genetic models and a review of their metallogenesis in the Aravalli range, NW India.
  • Domingos, P., et al. Beyond independence: conditions for optimality of the simple Bayesian classifier.
  • Domingos, P., et al., 1997. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning.
  • Duda, R.O., Hart, P.E., 1973. Pattern Classification and Scene Analysis. Wiley, New York...
  • Friedman, N., et al., 1997. Bayesian network classifiers. Machine Learning.
  • Goodfellow, W.D., 2001. Attributes of modern and ancient sediment-hosted, sea-floor hydrothermal deposits. Proceedings...
  • GSI, 1981. Total intensity aeromagnetic map and map showing the magnetic zones of the Aravalli Region, southern...
  • Gupta, S.N., Arora, Y.K., Mathur, R.K., Iqballuddin, Prasad, B., Sahai, T.N., Sharma, S.B., 1995a. Lithostratigraphic...
  • Gupta, S.N., Arora, Y.K., Mathur, R.K., Iqballuddin, Prasad, B., Sahai, T.N., Sharma, S.B., 1995b. Structural Map of...