Combining uncertainty and imprecision in models of medical diagnosis
Introduction
Uncertainty and imprecision are important concepts of medical knowledge. A symptom is an uncertain indication of a disease as it may or may not occur with the disease. Thus, a measure of uncertainty should be used to estimate the disease risk when the symptom is observed. Linguistic statements, for instance “high fever” or “overweight”, are in common use when describing symptoms. A measure of imprecision is advantageous for their representation. Uncertainty characterizes a relation between symptoms and diseases, while imprecision is associated with the symptom representation.
Uncertainty and imprecision have been separately considered in a variety of problems of diagnosis support. Let us begin with score tests (called indices in medical handbooks) that are used to estimate the disease risk. The scores are assigned to health parameters and an algebraic sum of scores is treated as a measure of the risk of disease. Although such evaluation is far from being perfect, it is apparently commonly used in practice. Yet, the algebraic sum changes when a new symptom is added. In this case statistical investigations are required, as the previous score may be wrong for the new enlarged set of symptoms. However, the complexity of this investigation increases along with the final number of symptoms. Therefore, the set of symptoms of the medical index cannot be adapted when new diagnostic procedures are implemented or when laboratory tests are changed.
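The score-test idea can be illustrated with a minimal sketch. The finding names, score values and the function `index_score` are hypothetical and are not taken from any medical index; the sketch only shows that the algebraic sum shifts as soon as a new symptom is added to the table.

```python
def index_score(findings, score_table):
    """Return the algebraic sum of the scores assigned to the observed findings."""
    return sum(score_table[name] for name in findings)


# Hypothetical score table (illustrative values only).
score_table = {"fever": 2, "tachycardia": 1, "weight_loss": 3}
observed = ["fever", "weight_loss"]
score_before = index_score(observed, score_table)

# Adding a new symptom changes the sum for the same patient,
# so a threshold calibrated for the old table may no longer be valid.
extended_table = dict(score_table, tremor=2)
score_after = index_score(observed + ["tremor"], extended_table)
```

Under these assumed values the same patient moves from a score of 5 to a score of 7, which is why the index would have to be recalibrated statistically for the enlarged symptom set.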
The disease risk is typically estimated within the formalism of probability [22], [31]. The use of this theory, however, requires a number of assumptions to be satisfied. For instance, determining the a priori probability of the disease is an evident challenge. Experts often estimate this probability to avoid expensive investigations and save time. Difficulties in determining the a priori and conditional probabilities may result in incorrect conclusions. Besides, it is questionable whether Bayesian reasoning is an adequate model of diagnostic inference, as eliciting the expert’s beliefs is a hard task [36]. Even if the Bayesian dependencies have been specified, the calculation of probabilities can be numerically complex. Simplified calculations [22] and the iterative Bayes formula [14] are used to cope with these practical problems. Yet such modifications influence the probability values; for example, successive symptoms have a decreasing impact on the final probability of a disease.
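The sensitivity of a Bayesian estimate to the a priori probability can be seen in a small sketch. It assumes a binary disease hypothesis and symptoms that are conditionally independent given the disease state; this is a textbook simplification and not necessarily the iterative formula of [14]. All probabilities and function names are hypothetical.

```python
def posterior(prior, p_symptom_given_disease, p_symptom_given_healthy):
    """Bayes' rule for one observed symptom and a binary disease hypothesis."""
    numerator = p_symptom_given_disease * prior
    evidence = numerator + p_symptom_given_healthy * (1.0 - prior)
    return numerator / evidence


def sequential_posterior(prior, likelihoods):
    """Apply Bayes' rule symptom by symptom.

    Assumes conditional independence of the symptoms, so the posterior
    after one symptom serves as the prior for the next.
    """
    p = prior
    for p_given_disease, p_given_healthy in likelihoods:
        p = posterior(p, p_given_disease, p_given_healthy)
    return p


# The same symptom evidence with two different expert priors.
low_prior_result = posterior(0.01, 0.9, 0.1)
high_prior_result = posterior(0.10, 0.9, 0.1)
```

With these assumed likelihoods, a prior of 0.01 yields a posterior of about 0.08, while a prior of 0.10 yields 0.5, so an expert's rough guess of the prior changes the conclusion considerably.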
Fuzzy sets are often applied to medical reasoning in order to model both the uncertainty of a diagnosis and the imprecision of symptoms. Diagnosis support systems operate on rules with fuzzy premises, which represent imprecise symptoms. During inference, fuzzy relations or implications are used, so conclusions are also represented as fuzzy sets. Most of the problems specific to implementing probability theory can be avoided in this way; however, the significance of a single symptom for the diagnosis may become overestimated. If the final conclusion is obtained as the maximum of the rule consequents, it is in practice determined by the consequent of the rule with the most reliable antecedent. A single symptom is hardly ever sufficient for a diagnosis, so the result can be counterintuitive. When a different aggregation operator is chosen, for instance the algebraic conorm S(x,y) = x + y − xy, the final conclusion is influenced by several rule consequents. Yet the symptom with the strongest confirmation remains decisive for the diagnosis. To avoid this shortcoming, weights can be assigned to the individual rules, cf. [16], [23], [31].
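The difference between the two aggregation operators can be sketched as follows. The confirmation degrees are hypothetical, and the function names are chosen only for this illustration.

```python
from functools import reduce


def algebraic_conorm(x, y):
    """Algebraic conorm S(x, y) = x + y - xy."""
    return x + y - x * y


def aggregate_max(degrees):
    """Aggregate rule consequents with the maximum operator."""
    return max(degrees)


def aggregate_conorm(degrees):
    """Aggregate rule consequents by folding the algebraic conorm over them."""
    # 0.0 is the neutral element of the conorm: S(0, y) = y.
    return reduce(algebraic_conorm, degrees, 0.0)


# Hypothetical confirmation degrees of three rules supporting one diagnosis.
degrees = [0.9, 0.2, 0.1]
by_max = aggregate_max(degrees)
by_conorm = aggregate_conorm(degrees)
```

For these assumed degrees the maximum returns 0.9 and ignores the two weaker rules, whereas the conorm gives roughly 0.928: the weaker rules contribute, but the strongest one still dominates the result.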
It might be concluded that the abovementioned methods do not adequately model diagnostic inference. Still, some of their features should be retained in a model of the diagnosis. The objective of the study is to find a model of the diagnosis in which the disease risk changes monotonically as symptoms gradually appear. At the same time, it should be estimated whether the symptoms match the patient’s state precisely enough to be included in the inference. The disease uncertainty can be represented in the framework of the Dempster–Shafer theory of evidence. A focal element defined in the theory can be described by a fuzzy set. Thus, an extension of the Dempster–Shafer theory to fuzzy focal elements can provide a basis for representing the uncertainty and imprecision of the diagnosis. The belief measure defined in the theory can be used to evaluate diagnostic hypotheses and to determine the final conclusion. Beliefs of competing hypotheses can be calculated and then compared. The diagnosis with the greatest belief becomes the final conclusion. This apparently simple model of diagnosis involves many theoretical and implementation problems. Some of them will be discussed in this study.
Classical methods of medical diagnosis modeling are presented in Section 2. An alternative approach to modeling is proposed in Section 3. This section offers an interpretation of the Dempster–Shafer theory and links it to fuzzy sets. It also provides a representation of uncertainty and imprecision and includes suggestions for matching observations with knowledge. In Section 4 an algorithm of diagnosis modeling is put forward. The algorithm relies on the knowledge represented by rules to infer the final conclusion and its belief value. The next section comprises experimental studies. The methods suggested in the previous sections are tested on data of thyroid gland diseases available on the Internet. The origin of the data makes it possible both to refer to other authors’ research work and to verify the computations. The results of calculations are presented in Section 6 and discussed in Section 7. The last section of the study provides final conclusions.
Section snippets
Diagnostic rule modeling—probabilistic and fuzzy approaches
Uncertainty in rule-based models [5], [14], [20], [31] of medical diagnosis is quantified by two basic approaches: one based on the formalism of probability, e.g., [14], and one based on fuzzy sets [5]. A combination of these two has been discussed both for the classical definition of probability and for belief and plausibility measures defined in the Dempster–Shafer theory [29]. For instance, probabilistic sets have been proposed to model membership functions in terms of random
Belief and plausibility of medical diagnosis
Shafer has introduced a theory of belief functions [29] on the basis of a generalization of Bayes conditional probability performed by Dempster [10]. The generalization consists of a conditioning rule and a combination rule [30]. Shafer has defined focal elements, i.e. predicates with information about their truth. Therefore, the joint theory is called the Dempster–Shafer theory (DST). The DST is meant to avoid classical conditional probability limitations in the combination of evidence [31].
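The basic notions of the DST can be sketched in a few lines. The sketch covers only the classical theory with crisp focal elements, not the extension to fuzzy focal elements proposed in this study. The frame of discernment, the mass values and the function names are assumptions made for illustration; a basic probability assignment is stored as a dictionary from `frozenset` focal elements to masses.

```python
def belief(bpa, hypothesis):
    """Bel(A): total mass of non-empty focal elements contained in A."""
    return sum(m for focal, m in bpa.items() if focal and focal <= hypothesis)


def plausibility(bpa, hypothesis):
    """Pl(A): total mass of focal elements that intersect A."""
    return sum(m for focal, m in bpa.items() if focal & hypothesis)


def combine(m1, m2):
    """Dempster's rule of combination for two independent sources of evidence."""
    joint = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                joint[common] = joint.get(common, 0.0) + mass_a * mass_b
            else:
                # Mass that would fall on the empty set is treated as conflict.
                conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize so that the combined masses sum to one.
    return {focal: mass / (1.0 - conflict) for focal, mass in joint.items()}


# Hypothetical frame of discernment and evidence from two symptoms.
FRAME = frozenset({"H", "D1", "D2"})
m_first = {frozenset({"D1"}): 0.6, FRAME: 0.4}
m_second = {frozenset({"D1", "D2"}): 0.5, FRAME: 0.5}
m_joint = combine(m_first, m_second)
```

In this assumed example there is no conflict, and the combined evidence gives a belief of about 0.6 and a plausibility of 1.0 for D1. The mass left on the whole frame expresses ignorance explicitly, which a single probability distribution cannot do.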
Algorithmic considerations
The diagnosis elements described in the previous sections compose a diagnostic model that can be implemented in an algorithm of diagnosis support. The algorithm models two stages of the diagnosis: gathering knowledge and inference about a specific case (patient). Knowledge consists of expert rules and the BPA. The contribution of an expert to the creation of the knowledge base can be limited to the selection and classification of training data. If the expert cannot help in further work, focal elements have
Experimental data
The experiments used data available at ftp.ics.uci.edu/pub/machine-learning-databases/thyroid-disease (files new-thyr.*). This database concerns thyroid gland diseases and includes the diagnosis (v1) and five measurable medical parameters (v2, …, v6), as follows:
- v1: diagnosis, which can be euthyroidism (health, H), hyperthyroidism (D1) and hypothyroidism (D2);
- v2: T3-resin uptake test (a percentage);
- v3: total serum thyroxin as measured by the isotopic displacement method;
- v4: total serum
Selection of the membership function
The evaluation of the presented algorithm is based on the percentage of misdiagnosed test cases. A test set includes cases with the same diagnosis (identical v1 values). Belief measures of three possible diagnoses are calculated for the test set. A case is properly classified if Bel of the diagnosis indicated a priori by v1 has the greatest value. If belief measure values of two diagnoses including the correct one are equal, the case is considered as wrongly classified.
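The classification criterion described above can be sketched as follows. The belief values are hypothetical and the function names are chosen only for this illustration; each case is represented as a pair of a belief dictionary and the diagnosis given by v1.

```python
def is_properly_classified(beliefs, true_diagnosis):
    """A case counts as correct only if the true diagnosis has the strictly greatest Bel."""
    best = max(beliefs.values())
    winners = [diagnosis for diagnosis, bel in beliefs.items() if bel == best]
    # A tie that includes the correct diagnosis is treated as a misdiagnosis.
    return winners == [true_diagnosis]


def misdiagnosis_percentage(cases):
    """Percentage of test cases that are not properly classified."""
    wrong = sum(
        1 for beliefs, truth in cases if not is_properly_classified(beliefs, truth)
    )
    return 100.0 * wrong / len(cases)


# Hypothetical belief values for four test cases.
test_cases = [
    ({"H": 0.7, "D1": 0.2, "D2": 0.1}, "H"),   # correct
    ({"H": 0.4, "D1": 0.4, "D2": 0.2}, "H"),   # tie, counted as wrong
    ({"H": 0.1, "D1": 0.3, "D2": 0.6}, "D1"),  # wrong
    ({"H": 0.2, "D1": 0.5, "D2": 0.3}, "D1"),  # correct
]
error_rate = misdiagnosis_percentage(test_cases)
```

Counting ties as errors is the conservative choice: the reported percentage never benefits from a case in which the algorithm could not actually discriminate between diagnoses.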
Calculations of the
Discussion
The proposed model of medical diagnosis considers several medical hypotheses simultaneously and indicates the hypothesis with the greatest belief as the final diagnosis. The reasoning process is clear and easy to understand for physicians who are not experts in decision support systems. An algorithm that is understood stands a better chance of approval, so this is an important feature of the proposed method.
In the present work the empirical distribution of the training data
Conclusions
The present study proposes a model of medical diagnosis and an algorithm for its implementation in diagnosis support. In the model, uncertainty and imprecision of symptoms are represented separately. Their measures are combined at the stage of the final conclusion. This makes the reasoning effective and clear for physicians. Uncertainty of diagnosis is described by the belief measure defined in the Dempster–Shafer theory, while imprecision of symptoms is modeled by fuzzy sets. Thus, the
References (39)
A new approach to approximate reasoning using a fuzzy logic. Fuzzy Sets and Systems (1979).
The Dempster–Shafer theory of evidence: an alternative approach to multicriteria decision modelling. Omega (2000).
Knowledge acquisition in the fuzzy knowledge representation framework of a medical consultation system. Artificial Intelligence in Medicine (2004).
A comparison of Bayes, Dempster–Shafer and endorsement theories for managing knowledge uncertainty in the context of land cover monitoring. Computers, Environment and Urban Systems (2004).
Modelling vague beliefs using fuzzy-valued belief structures. Fuzzy Sets and Systems (2000).
Computer aided fuzzy medical diagnosis. Information Sciences (2004).
Fuzzy expert system with double knowledge base for ultrasonic classification. Expert Systems with Applications (2001).
Dynamic and static approaches to clinical data mining. Artificial Intelligence in Medicine (1999).
Uncertainty modeling and decision support. Reliability Engineering and System Safety (2004).
Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems (1978).
Toward a generalized theory of uncertainty (GTU)—an outline. Information Sciences.
Pattern Recognition with Fuzzy Objective Function Algorithms.
On the Dempster–Shafer evidence theory and non-hierarchical aggregation of belief structures. IEEE Transactions on Systems, Man and Cybernetics.
Comparison of multivariate discrimination techniques for clinical data-application to the thyroid functional state. Methods of Information in Medicine.
On distribution function description of probabilistic sets and its application in decision making. Fuzzy Sets and Systems.
Fuzzy and Neuro-fuzzy Intelligent Systems.
A generalisation of Bayesian inference. Journal of the Royal Statistical Society.
A neural network classifier based on Dempster–Shafer theory. IEEE Transactions on Systems, Man and Cybernetics.
The Dempster–Shafer theory of evidence.