Seeing is believing: Priors, trust, and base rate neglect

https://doi.org/10.1016/j.obhdp.2012.04.001

Abstract

Kahneman and Tversky (1973) described an effect they called ‘insensitivity to prior probability of outcomes’, later dubbed base rate neglect, which describes people’s tendency to underweight prior information in favor of new data. As probability theory requires that prior probabilities be taken into account, via Bayes’ theorem, the fact that most people fail to do so has been taken as evidence of human irrationality and, by others, of a mismatch between our cognitive processes and the questions being asked (Cosmides & Tooby, 1996). In contrast to both views, we suggest that simplistic Bayesian updating using base rates is not necessarily rational. To that end, we present experiments in which base rate neglect is often the right strategy, and show that people’s base rate usage varies systematically as a function of the extent to which the data that make up a base rate are perceived as trustworthy.

Highlights

► Is base rate neglect a rational strategy?
► Base rate neglect is altered by experimental manipulations of data trustworthiness.
► Modeling indicates that base rate neglect is too simple a concept.
► Base rates are just one aspect of a person’s prior data.
► Expecting people to use only the base rate as their prior for Bayesian updating is naive.

Introduction

In the closing remarks to A Philosophical Essay on Probabilities, Laplace (1814/1951) argues that “the theory of probabilities is at bottom only common sense reduced to calculus; it makes us appreciate with exactitude that which exact minds feel by a sort of instinct without being able ofttimes to give a reason for it”. Within probability theory, Bayes’ rule provides the mechanism by which a set of prior beliefs can be updated in light of evidence, as follows: given a hypothesis, h, which we believe has some prior probability of being correct, P(h), if we then observe some data, x, Bayes’ theorem tells us how to find P(h|x), the posterior probability that h is true given that we have now seen x:

P(h|x) = P(x|h) P(h) / P(x).    (1)
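To make Eq. (1) concrete, here is a minimal worked example (ours, with hypothetical numbers not drawn from the article): a condition with a 1% base rate, a test that detects it 90% of the time, and a 10% false-positive rate. Applying Bayes’ rule shows how strongly a low prior should, in principle, temper a positive test result.

```python
def posterior(prior, likelihood_h, likelihood_not_h):
    """Apply Bayes' rule (Eq. 1): P(h|x) = P(x|h) P(h) / P(x)."""
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence

# Hypothetical numbers, chosen only to illustrate the arithmetic:
# base rate P(h) = 0.01, hit rate P(x|h) = 0.9, false-positive rate P(x|not h) = 0.1
p = posterior(prior=0.01, likelihood_h=0.9, likelihood_not_h=0.1)
print(round(p, 3))  # ~0.083: the low base rate drags the posterior well below 0.9
```

In this hypothetical scenario, a judgment near 0.9 would be the classic signature of base rate neglect, while a judgment near 0.083 would indicate full use of the prior.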

As to whether Laplace’s claim provides a plausible account of human reasoning, one of the principal sources of discussion is base rate neglect, a phenomenon that seems to contradict the assertion that analytic probabilities are merely formalized versions of people’s intuitions about chance. The general finding is that, when people are provided with prior information (in the form of a base rate) along with new evidence, they typically weight the evidence provided by the new data far more heavily than that provided by the base rates (Kahneman & Tversky, 1973). This tendency to downgrade the value of the prior relative to the likelihood is taken to imply, firstly, that Bayes’ theorem does not provide a complete account of the reasoning employed by people (Villejoubert & Mandel, 2002) and, secondly, that people are therefore suboptimal or biased in their judgments and may be regarded as acting irrationally. Note, however, that these are two distinct claims. Clearly, underweighting the base rate information will lead people to make judgments that differ from those produced by a simplistic application of Eq. (1). The charge of irrationality, however, is a stronger claim, and somewhat more questionable.

Traditional approaches to the study of human decision making have tended to assume that rational behavior is best operationalized in terms of the strict adherence to some optimal strategy calculated by the researcher in advance (as is the case in most uses of expected utility theory). Any deviations from this researcher-specified strategy are then deemed to be evidence of irrational behavior. The major problem with this approach is that the manner in which these optimal strategies are designed is often extremely impractical – most notably, no consideration is given to the costs associated with time spent and computations performed. As argued by Todd and Gigerenzer (2000), it is by no means clear that a rational actor should, in fact, expend a great deal of time and effort in computing exact solutions to complicated problems, especially when fast and simple approximations are available.

This accords well with observations such as those made by McKenzie (2003), who argues that rational models should properly be regarded as theories, not standards, of behavior. This, it is argued, is because apparent errors observed in laboratory tasks actually tend to result from participants’ use of strategies that deliver good results in real-world tasks. Thus, while such strategies can be regarded as “irrational” within the context of the specific task, the strategies that are optimal in the experiment might in fact hinder performance in real life.

Guided by these ideas, we consider the question of base rate neglect with respect to how people should appropriately weight base rates and novel information in order to make predictions in real environments. To do so, we present two experiments (in three parts) that manipulate the quality of different sources of data presented to people. In doing so, we depart somewhat from the classic base rate neglect approach; specifically, we design scenarios that minimize the potential computational problems (as explained below) by making explicit what information needs to be aggregated and, instead, manipulate the apparent quality of the data. The results suggest that the strength of base rate neglect effects can be systematically manipulated by altering the trustworthiness of the data – while some people display some base rate neglect, the majority of our participants made decisions in a fashion consistent with assigning different levels of trust to different sources of evidence.

Section snippets

The existence and mitigation of base rate neglect

The original characterization of base rate neglect (Kahneman & Tversky, 1973) was simply that people tended to rely on a representativeness heuristic when making probability judgments while neglecting (i.e., underweighting) base rate information. That is, they would assign a high probability to a person being, for example, an engineer if the description they were provided sounded, stereotypically, like an engineer – while paying little attention to how many or few engineers were in the group of …

Experiment 1

As an initial examination, we consider the impact of varying the location, age, source and quantity of the data that provides a base rate. Motivated in part by Bar-Hillel’s (1980) manipulation of the distinction between ‘coincidental’ and ‘relevant’ base rates, we look at four distinct factors that could, each, affect the perceived degree of relevance of the base rate to the outcome being estimated. In order to focus specifically on these issues, we depart from many traditional base rate …

Participants

Participants were 20 university students and members of the general public, 10 males and 10 females, with a mean age of 30.4 (SD = 12.1). Each was paid for their participation with a $10 bookstore voucher.

Experimental design

As noted above, the scenarios used in our experiment were designed to maximize the extent to which people recognize the need to combine both sources of information, by explicitly placing the base rate data on a scale commensurate with a second source of evidence. To do so, both sources of …

Method

Experiment 1B aimed to expand on the three factors that contributed to the cover story in Experiment 1A. The design of the experiment was the same as for Experiment 1A and was, in fact, conducted simultaneously – using the same 20 participants, with the conditions intermixed with those used in the first study. In the Experiment 1B scenarios, however, the “base rate” was fixed at 75% using a sample of size 20 (i.e., 15 hits), and the new data were always based on a sample of size 4 with a single hit, …
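To make the benchmark concrete, the following sketch (our illustration, not part of the original method) pools the two samples described above as if they were equally trustworthy, assuming a uniform Beta(1, 1) prior over the underlying rate; the resulting posterior mean of roughly 0.65 is the judgment a participant would give if they weighted the base rate sample and the new sample purely by their sizes.

```python
def pooled_posterior_mean(hits_old, n_old, hits_new, n_new, a=1.0, b=1.0):
    """Posterior mean of theta under a Beta(a, b) prior, pooling both samples."""
    return (a + hits_old + hits_new) / (a + b + n_old + n_new)

# The Experiment 1B numbers: base rate sample of 15/20 (75%), new data of 1/4 (25%).
benchmark = pooled_posterior_mean(hits_old=15, n_old=20, hits_new=1, n_new=4)
print(round(benchmark, 3))  # ~0.654 -- far closer to 75% than to 25%
# A response near 0.25 would indicate base rate neglect; responses between
# 0.25 and 0.65 are consistent with discounting (rather than ignoring) the base rate.
```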

Method

Experiment 2 was conducted to address two issues regarding the design of the previous experiment. The first was that the results of Experiments 1A and 1B might have been affected by the use of a within-subjects design, which previous research has shown to increase the perceived salience of base rates and thus reduce the level of base rate neglect (see, e.g., Birnbaum and Mellers, 1983, Stolarz-Fantino et al., 2006). The second was to compare scenarios containing base rates explicitly indicated …

Modeling individual responses

The analyses presented in the previous sections focused primarily on how presenting various markers of trust (data age, location, source and quantity) systematically alters the average of the judgments given by participants. Implicitly, this analysis relies on the assumption that individual participants all use the same strategy: if everyone reports a number described by E[θ] + noise, then averaging helps to remove the noise. As an initial analysis of the data, this approach is …
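As a toy illustration of that concern (ours, using made-up responses rather than the actual data), the sketch below classifies each individual response by the nearest of three candidate strategies; it shows how a mixture of base-rate users and new-data users can produce a group average that corresponds to no strategy anyone actually used.

```python
def classify_response(response, base_rate=0.75, new_data=0.25, pooled=0.65):
    """Return the candidate strategy whose prediction lies nearest the response."""
    candidates = {
        "base rate only": base_rate,
        "new data only": new_data,
        "pooled (size-weighted)": pooled,
    }
    return min(candidates, key=lambda name: abs(candidates[name] - response))

# Hypothetical responses: two base-rate users and two new-data users.
responses = [0.74, 0.76, 0.26, 0.24]
print([classify_response(r) for r in responses])
print(sum(responses) / len(responses))  # 0.5 -- an "average strategy" that nobody used
```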

General discussion

As noted earlier, our base rate neglect paradigm differs significantly from that commonly used and, as a result, effects commonly argued to affect the magnitude of base rate neglect – such as the salience and representativeness of novel information – have been (deliberately) restricted. However, even in the absence of these factors, the base rate tended to be neglected in favor of newer (but completely commensurate) information, in a manner consistent with our view of base rate neglect as a …

References (36)

  • Baird, J. C., et al. (1970). Relative frequencies of numerical responses in ratio estimation. Perception and Psychophysics.
  • Bernoulli, J. (1713). Ars Conjectandi. Basilea:...
  • Birnbaum, M. H., et al. (1983). Bayesian inference. Combining base rates with opinions of sources who vary in credibility. Journal of Personality and Social Psychology.
  • Bryk, A. S., et al. (1992). Hierarchical linear models.
  • Fiedler, K. (2000). Beware of samples! A cognitive-ecological sampling approach to judgment biases. Psychological Review.
  • Gigerenzer, G. (1996). On narrow norms and vague heuristics: A reply to Kahneman and Tversky (1996). Psychological Review.
  • Gigerenzer, G., et al. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review.
  • Gigerenzer, G., et al. (1999). Simple heuristics that make us smart.
Cited by (37)

  • Schematic information influences memory and generalisation behaviour for schema-relevant and -irrelevant information (2022, Cognition). Citation excerpt: “If participants' location responses were influenced by assessing the relative probability of having studied a word-location association from a given semantic category, we would expect to observe an avoidance effect in the non-clustered condition. This base-rate neglect proposal (ignoring the absolute density of words in a given location) is a well-documented bias in the literature (Hawkins, Hayes, Donkin, Pasqualino, & Newell, 2015; Welsh & Navarro, 2012; Wolfe, 2007). It could potentially drive the avoidance effect if either implicit or explicit categorisation of the words were occurring (as would be necessary for the mutual exclusivity bias) or if participants were not categorising but were sensitive to the semantic distances between individual words.”

  • Selective sampling and inductive inference: Drawing inferences based on observed and missing evidence (2019, Cognitive Psychology). Citation excerpt: “One factor that can affect the integration of base rates with other relevant problem components is their relative salience. The extent to which people use base rate as opposed to individuating details in lawyer-engineer problems for example, is related to the amount of detail provided about each type of information (cf. Koehler, 1996; Welsh & Navarro, 2012). In the typical problem where base rate neglect is observed, base rates are mentioned only briefly while a rich description of individual features is provided.”

  • A pessimistic view of optimistic belief updating (2016, Cognitive Psychology). Citation excerpt: “The statistical artifacts identified by Harris and Hahn (2011) would exert no effect if participants felt they had no individuating knowledge with which to distinguish themselves from the base rate (the average person’s risk). The same is true if they were entirely unaware of base rates, or failed to incorporate them into their own estimates of personal risk as has sometimes been claimed (e.g., Kahneman & Tversky, 1973; but see also Cosmides & Tooby, 1996; Welsh & Navarro, 2012). The data from Experiments 2–4 make clear that this is not the case.”