Predicting carcinoid heart disease with the noisy-threshold classifier
Introduction
Bayesian networks have become a widely accepted formalism for reasoning under uncertainty by providing a concise representation of a joint probability distribution over a set of random variables [1]. This distribution is factorized according to an associated acyclic directed graph (ADG) that represents the independence structure between random variables. However, constructing a Bayesian network that fully captures this independence structure for a realistic domain has proven to be a difficult task. It requires either manual specification of the ADG from available expert knowledge or, when we resort to structure learning, large amounts of high-quality data.
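As an illustration of the factorization mentioned above, the standard chain-rule form for a Bayesian network can be written as follows. The notation is an assumption made here for illustration: the variables are written X_1, ..., X_n, and \pi(X_i) denotes the parents of X_i in the ADG.

```latex
P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \pi(X_i)\bigr)
```

Each factor depends only on a variable and its parents, which is what makes the representation concise when the graph is sparse.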
An alternative to constructing an ADG that fully captures the independence structure between variables within the domain is to use a fixed or severely constrained graph topology for classification purposes. In the latter context we call a Bayesian network a Bayesian classifier. The use of Bayesian methods in medicine was first proposed by Ledley and Lusted in their classic 1959 paper [2], and one of the first successful implementations of Bayesian classifiers in medicine was De Dombal’s system for the diagnosis of acute abdominal pain [3]. The classifier that was used assumes independence of symptoms given the disease, and is known as the naive-Bayes classifier. Over the years, many different Bayesian classifier architectures have been proposed, and many of them focus on lifting the independence assumptions of the naive-Bayes classifier [4]. However, a standard technique such as logistic regression, which is used extensively in medicine, can also be interpreted in terms of a Bayesian classifier architecture (Fig. 1). Other examples of Bayesian classifier architectures can be found in refs. [5], [6], [7].
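The independence assumption of the naive-Bayes classifier described above can be illustrated with a minimal sketch for a binary disease variable and binary symptoms. The function name, argument layout, and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not taken from the systems cited above.

```python
def naive_bayes_posterior(prior, likelihoods, findings):
    """Return P(disease present | findings) under the naive-Bayes assumption.

    prior       -- P(disease present)
    likelihoods -- list of pairs (P(symptom present | disease present),
                                  P(symptom present | disease absent))
    findings    -- list of booleans, True if the symptom was observed
    """
    score_present = prior
    score_absent = 1.0 - prior
    for (p_if_present, p_if_absent), observed in zip(likelihoods, findings):
        # Symptoms are treated as independent given the disease status,
        # so each one contributes a separate multiplicative factor.
        score_present *= p_if_present if observed else 1.0 - p_if_present
        score_absent *= p_if_absent if observed else 1.0 - p_if_absent
    # Normalize over the two disease states.
    return score_present / (score_present + score_absent)


# Hypothetical numbers: one informative symptom, one uninformative symptom.
posterior = naive_bayes_posterior(0.5, [(0.8, 0.2), (0.5, 0.5)], [True, False])
```

The second symptom has the same likelihood under both disease states, so it leaves the posterior unchanged; only the first symptom shifts it away from the prior.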
Although the actual joint probability distribution and the joint probability distribution represented by the Bayesian classifier typically differ considerably, this approach can still yield good results with respect to the classification task [8]. However, a weakness of this approach is that the ad hoc restrictions placed on the underlying graph effectively reduce the Bayesian network to a black-box model, often making the relation between properties of the domain and the classification outcome difficult to understand. This is an undesirable property, especially in medicine, where ideally one wants to be able to interpret how the classification outcome (such as diagnosed disease or patient prognosis) relates to the available domain knowledge (its causes). The explanation of drawn conclusions is required to increase the acceptance of machine-learning techniques in practice [9], [10].
In this paper, we employ a novel Bayesian classifier, introduced in ref. [11], that facilitates this interpretation because it explicitly provides a semantics in terms of cause and effect relationships [12]. This noisy-threshold classifier is based on a generalization of the well-known noisy-or model, which has already been used for the purpose of text classification in ref. [13]. In order to demonstrate the merits of the noisy-threshold classifier in a medical context, we apply the technique to the prediction of carcinoid heart disease (CHD), a serious condition that arises as a complication of certain neuroendocrine tumors [14]. We demonstrate that the noisy-threshold classifier performs competitively with state-of-the-art classification techniques for this medically relevant problem. Furthermore, an expert physician at the Netherlands Cancer Institute (NKI) was consulted, and it is demonstrated how her knowledge concerning CHD relates to the parameters that were estimated for the noisy-threshold classifier.
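To make the relation between the noisy-or model and its threshold generalization concrete, the following is a minimal sketch and not the formulation of ref. [11]. It assumes that each cause independently activates a hidden intermediate variable with some probability, and that the effect is present exactly when at least k intermediate variables are active; under that assumption, the noisy-or model is the special case k = 1. The function names and the example probabilities are hypothetical.

```python
def noisy_threshold(activation_probs, k):
    """P(effect present) if the effect requires at least k active intermediates.

    activation_probs[i] is the (assumed) probability that the i-th hidden
    intermediate variable is active, given the observed state of its cause.
    """
    # dist[j] holds the probability that exactly j intermediates are active
    # among those processed so far (a Poisson-binomial distribution).
    dist = [1.0]
    for p in activation_probs:
        updated = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            updated[j] += mass * (1.0 - p)   # this intermediate stays inactive
            updated[j + 1] += mass * p       # this intermediate becomes active
        dist = updated
    return sum(dist[k:])


def noisy_or(activation_probs):
    """Noisy-or as the k = 1 case: at least one active intermediate suffices."""
    return noisy_threshold(activation_probs, 1)


# Hypothetical activation probabilities for three causes.
p_or = noisy_or([0.9, 0.8, 0.0])
p_two = noisy_threshold([0.5, 0.5], 2)
```

Raising k makes the effect harder to trigger, since more causes must contribute at once; with k = 1 the result reduces to one minus the product of the individual failure probabilities.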
This paper proceeds as follows. Section 2 introduces the necessary preliminaries and discusses the semantics of the noisy-threshold model, whereas Section 3 describes the medical problem. The use of the noisy-threshold model as a Bayesian classifier is discussed in Section 4. The results on the classification task and the medical interpretation by the expert physician are presented in Section 5. The paper ends with some concluding remarks in Section 6.
Bayesian networks
Bayesian networks provide a compact factorization of a joint probability distribution over a set of random variables by exploiting the notion of conditional independence [1]. Conditional independence can be represented by an acyclic directed graph (ADG) G consisting of vertices and arcs, and relies on the notion of d-separation [1]. Let G be an ADG and P a joint probability distribution over a set of random variables. We assume that there is a one-to-one correspondence
Carcinoid heart disease
Carcinoid tumors belong to the group of neuroendocrine tumors, which are known for the production of vasoactive agents in the presence of metastatic disease, usually hepatic (liver) metastases. Among these agents, serotonin is the most important, leading to the characteristic carcinoid syndrome of flushes and diarrhea. The other main characteristic feature of neuroendocrine tumors is the slow progression of most tumors if the histology shows a low-grade pattern [26].
Serotonin
Classifier construction
Construction of a noisy-threshold classifier (NTC) proceeds as follows. We first determine the cause variables and effect variable E that are used in the classifier. In the context of a classifier, the cause variables stand for the attributes and the effect variable stands for the class-variable. Secondly, we need to determine the positive states of the variables. In the CHD domain, the positive states are simply defined as the presence of attributes that affect the presence of the
Classification performance
Table 2 lists the classification accuracy for noisy-threshold classifiers to . The noisy-threshold classifier is selected, based on the validation set , and shows the best classification accuracy of on the test set . Note that this exceeds considerably the classification accuracy of 0.54 for the noisy-or classifier .
In order to test how well the NTC performs compared with the physician, and with the other classification algorithms that were discussed in Section
Conclusions
The noisy-threshold classifier is a novel type of classifier that has a well-defined semantics in terms of cause and effect. Because of the independence assumptions made by the classifier, its parameters can be estimated reliably without resorting to huge amounts of data. This is an important feature since many domains are characterized by limited amounts of data, as discussed in ref. [39]. Learning Bayesian classifiers from data is to be contrasted with the construction of a full
Acknowledgements
This research was sponsored by the Netherlands Organization for Scientific Research (NWO) under grant numbers 612.066.201 and FN4556. We would like to thank the anonymous reviewers for their valuable comments.
References (41)
- et al. An analysis of physician attitudes regarding computer-based clinical consultation systems. Comput Biomed Res (1981)
- et al. A new look at causal independence
- Parameter adjustment in Bayes networks. The generalized noisy OR-gate
- Bayesian network modelling by qualitative patterns. Artif Intell (2005)
- The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol (1975)
- et al. Networks of probabilistic events in discrete time. Int J Approx Reason (2002)
- et al. Nasonet, modeling the spread of nasopharyngeal cancer with networks of probabilistic events in discrete time. Artif Intell Med (2002)
- Probabilistic reasoning in intelligent systems: networks of plausible inference (1988)
- et al. Reasoning foundation of medical diagnosis: symbolic logic, probability, and value theory aid our understanding of how physicians reason. Science (1959)
- et al. Computer aided diagnosis of acute abdominal pain. Br Med J (1972)