1 Introduction
2 Risk literacy
3 Analyzing and modelling: classification methods
3.1 The normative approach
3.2 Naïve Bayes
3.3 Logistic regression
3.4 Fast and frugal classification trees
4 Comparison of methods
- Naïve Bayes: The prior probability \(P(D)\) and the evidence distributions \(P(E_{k}^{H}|D)\) and \(P(E_{k}^{H}|\overline{D})\) were estimated using the Beta-Binomial conjugate prior method. This simple Bayesian method estimates the probability of an event as \((r + 1)/(m + 2)\), where r is the number of trials on which the event occurred out of m total trials; this is the posterior mean under a uniform Beta(1, 1) prior. Its advantage is that it never estimates a probability as zero when the event does not occur in the sample, or as one when the event occurs in every case in the sample. Each case was classified as “yes” if the posterior probability \(P(D|E)\) was greater than 0.5 and “no” otherwise.
- Logistic regression: We used a standard logistic regression method to estimate the regression coefficients from (2). Each case was classified as “yes” if the posterior probability \(P(D|E)\) was greater than 0.5 and “no” otherwise.
- CART: CART (Breiman 1984) is a method for building trees for classifying categorical variables or predicting numerical variables. It uses a collection of rules designed to maximize information gain from each split of the tree, with splits terminating at a leaf node when an additional split would yield no further information gain. CART trees are not necessarily fast and frugal.
- Fast and frugal trees with Zig-Zag rule: This method constructs the tree by using positive and negative cue validities. Positive validity is the proportion of cases with a positive outcome among all cases with a positive cue value. Negative validity is the proportion of cases with a negative outcome among all cases with a negative cue value. The Zig-Zag method alternates between “yes” and “no” exits from one level to the next. At each level it selects, from the cues not already chosen, the cue with the greatest positive validity (for a “yes” exit) or the greatest negative validity (for a “no” exit).
- Fast and frugal trees with MaxVal rule: This method also uses positive and negative cue validities. It begins by ranking the cues according to the higher of each cue’s positive and negative validities. It then applies the cues in the order of this ranking, exiting in the positive direction if the cue’s positive validity is the higher of the two and in the negative direction if its negative validity is higher. Ties in this process are broken randomly.
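
The estimation rule and the two tree-ordering rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the comparison: the function names, the toy data set, and the cue labels A, B and C are all hypothetical. The sketch assumes binary cues coded 0/1, breaks ties deterministically rather than randomly, and takes the first exit direction of the Zig-Zag rule as a parameter because the description above does not fix it. It returns only the order of cues and exit directions, and does not build the final level of the tree, where a fast and frugal tree has both exits.

```python
from fractions import Fraction


def beta_binomial_estimate(r, m):
    """Estimate an event probability as (r + 1)/(m + 2)."""
    return Fraction(r + 1, m + 2)


def cue_validities(cue, outcome):
    """Return (positive validity, negative validity) of one binary cue."""
    pos = [o for c, o in zip(cue, outcome) if c == 1]  # outcomes where cue is positive
    neg = [o for c, o in zip(cue, outcome) if c == 0]  # outcomes where cue is negative
    pos_val = Fraction(sum(pos), len(pos)) if pos else Fraction(0)
    neg_val = Fraction(len(neg) - sum(neg), len(neg)) if neg else Fraction(0)
    return pos_val, neg_val


def maxval_order(cues, outcome):
    """Rank cues by the higher of their two validities (MaxVal rule).

    Simplification: ties are resolved deterministically here, whereas the
    text breaks them randomly.
    """
    ranked = []
    for name, values in cues.items():
        pos_val, neg_val = cue_validities(values, outcome)
        exit_dir = "yes" if pos_val >= neg_val else "no"
        ranked.append((max(pos_val, neg_val), name, exit_dir))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return [(name, exit_dir) for _, name, exit_dir in ranked]


def zigzag_order(cues, outcome, first_exit="yes"):
    """Alternate exit directions, picking the best unused cue for each (Zig-Zag rule).

    Assumption: the starting direction is a free parameter.
    """
    remaining = dict(cues)
    order, exit_dir = [], first_exit
    while remaining:
        idx = 0 if exit_dir == "yes" else 1  # 0: positive validity, 1: negative validity
        best = max(remaining, key=lambda n: cue_validities(remaining[n], outcome)[idx])
        order.append((best, exit_dir))
        del remaining[best]
        exit_dir = "no" if exit_dir == "yes" else "yes"
    return order


# Hypothetical toy data: ten cases, three binary cues.
outcome = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
cues = {
    "A": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "B": [1, 1, 1, 1, 0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 0, 1, 1, 0, 0, 0, 0],
}

print(beta_binomial_estimate(0, 10))   # event never observed: estimate stays above zero
print(beta_binomial_estimate(10, 10))  # event always observed: estimate stays below one
print(maxval_order(cues, outcome))
print(zigzag_order(cues, outcome, first_exit="no"))
```

In this toy data set cue A has a positive validity of 1 and cue B has the highest negative validity (3/4), so MaxVal places A first with a “yes” exit, while Zig-Zag started in the “no” direction places B first.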