
Bayesian Models in Cognitive Neuroscience: A Tutorial

In: An Introduction to Model-Based Cognitive Neuroscience

Abstract

This chapter provides an introduction to Bayesian models and their application in cognitive neuroscience. The central feature of Bayesian models, as opposed to other classes of models, is that they represent the beliefs of an observer as probability distributions, allowing them to integrate information while taking its uncertainty into account. In this chapter, we will consider how the probabilistic nature of Bayesian models makes them particularly useful in cognitive neuroscience. We will consider two types of tasks in which we believe a Bayesian approach is useful: optimal integration of evidence from different sources, and the development of beliefs about the environment given limited information (such as during learning). We will develop some detailed examples of Bayesian models to give the reader a taste of how the models are constructed and what insights they may be able to offer about participants’ behavior and brain activity.


Notes

  1. In fact, the probability of each location given hearing and vision can only be obtained by multiplication if the noise giving rise to the two probability density functions is independent. In this case, we are talking about uncertainty that arises from noise in the sensory systems, which we can safely assume is independent between vision and hearing.

  2. In all the examples and exercises given here, we obtain an approximate solution by evaluating p(x) for discrete values of \((\mu, \sigma^2)\). In the continuous case, Eq. 9.3 would become:

    \(p(x) = \int d\mu \int d\sigma^2 \, p(x \mid x \sim N(\mu, \sigma^2)) \times p(x \sim N(\mu, \sigma^2) \mid x_{1:i})\)

References

  1. Bayes T (1763) An essay towards solving a problem in the doctrine of chances. Phil Trans 53:370–418


  2. Behrens TE, Woolrich MW, Walton ME, Rushworth MF (2007) Learning the value of information in an uncertain world. Nat Neurosci 10:1214–1221


  3. Chater N, Oaksford M (eds) (2008) The probabilistic mind: Prospects for Bayesian cognitive science. Oxford University Press, Oxford


  4. Courville AC, Daw ND, Touretzky DS (2006) Bayesian theories of conditioning in a changing world. Trends Cogn Sci 10:294–300


  5. Cox RT (1946) Probability, frequency and reasonable expectation. Am J Phys 14:1–13


  6. Dayan P, Kakade S, Montague PR (2000) Learning and selective attention. Nat Neurosci 3:1218–1223


  7. Ernst MO, Banks MS (2002) Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415:429–433


  8. Gregory R (1966) Eye and brain. Princeton University Press, Princeton


  9. Jacobs RA (1999) Optimal integration of texture and motion cues to depth. Vis Res 39:3621–3629


  10. Knight FH (1921) Risk, uncertainty and profit. Hart, Schaffner and Marx, Boston


  11. Körding KP, Wolpert DM (2006) Bayesian decision theory in sensorimotor control. Trends Cogn Sci 10:319–326


  12. MacKay DJC (2003) Information theory, inference, and learning algorithms. Cambridge University Press, Cambridge


  13. Mars RB et al (2008) Trial-by-trial fluctuations in the event-related electroencephalogram reflect dynamic changes in the degree of surprise. J Neurosci 28:12539–12545


  14. McGrayne SB (2011) The theory that would not die: How Bayes’ rule cracked the enigma code, hunted down Russian submarines, and emerged triumphant from two centuries of controversy. Yale University Press, New Haven


  15. O’Reilly JX (2013) Making predictions in a changing world-inference, uncertainty, and learning. Front Neurosci 7:105


  16. O’Reilly JX, Mars RB (2011) Computational neuroimaging: Localising Greek letters? Trends Cogn Sci 15:450


  17. O’Reilly JX, Jbabdi S, Behrens TE (2012) How can a Bayesian approach inform neuroscience? Eur J Neurosci 35:1169–1179


  18. Payzan-LeNestour E, Bossaerts P (2011) Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings. PLoS Comp Biol 7:e1001048


  19. Posner MI, Snyder CRR, Davidson BJ (1980) Attention and the detection of signals. J Exp Psychol Gen 109:160–174


  20. Real LA (1991) Animal choice behavior and the evolution of cognitive architecture. Science 253:980–986


  21. Robbins H (1952) Some aspects of the sequential design of experiments. Bull Amer Math Soc 58:527–535


  22. Segall MH, Campbell DT, Herskovits MJ (1963) Cultural differences in the perception of geometric illusions. Science 139:769–771


  23. Silver N (2012) The signal and the noise: Why most predictions fail but some don’t. Penguin, New York



Author information


Correspondence to Jill X. O’Reilly.


Appendices

Appendix A: One-Armed Bandit Model

We can write down the generative model by which the rewarded action (A or B) is selected as follows:

$$ \text{A rewarded on trial } i \sim \mathrm{Bernoulli}(q_i) $$

$$ q_i = \begin{cases} q_{i-1} & \text{if } J = 0 \\ \mathrm{rand}(0, 1) & \text{if } J = 1 \end{cases} $$

… where J is a binary variable which determines whether there was a jump in the value of q between trial i-1 and trial i; J itself is determined by

$$ J \sim \mathrm{Bernoulli}(v) $$

… where v is the probability of a jump, e.g. if a jump occurs on average every 15 trials, v = 1/15.

Then we can construct a Bayesian computer participant which infers the values of q and v on trial i as follows:

$$ p(q_i, v \mid x_{1:i}) \propto p(x_i \mid q_i, v)\, p(q_i, v) $$

where the prior at trial i, \(p(q_i, v)\), is given by

$$ p(q_i, v) = \int p(q_i \mid q_{i-1}, v)\, p(q_{i-1}, v \mid x_{1:i-1})\, dq_{i-1} $$

and the transition function \(p(q_i \mid q_{i-1}, v)\) is given by

$$ p(q_i \mid q_{i-1}, v) = (1 - v)\,\delta(q_i - q_{i-1}) + v\,\mathrm{Uniform}(0, 1) $$

… that is, with probability 1 − v the value of q carries over unchanged from trial i − 1, and with probability v a new value of q is drawn uniformly from (0, 1).
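To make this recursion concrete, here is a minimal sketch in Python (the chapter’s own example code is in MATLAB). As in footnote 2, the belief about q is represented on a discrete grid; for brevity the sketch treats v as known rather than inferring it jointly with q, and all function and variable names are our own illustrative choices:

```python
import numpy as np

def bandit_learner(x, v=1.0 / 15, n_grid=100):
    """x: sequence of outcomes (1 = A rewarded, 0 = B rewarded).
    Returns the grid over q and the posterior over q after every trial."""
    q = np.linspace(0.005, 0.995, n_grid)      # discrete grid of candidate q values
    belief = np.full(n_grid, 1.0 / n_grid)     # flat prior over q
    posteriors = np.zeros((len(x), n_grid))

    for i, xi in enumerate(x):
        # Transition step: with probability (1 - v) q stays where it was,
        # with probability v it is redrawn uniformly on (0, 1).
        belief = (1 - v) * belief + v * (1.0 / n_grid)
        # Likelihood of the observed outcome under each candidate q,
        # then Bayes' rule (multiply by the prior and renormalise).
        likelihood = q if xi == 1 else 1 - q
        belief = likelihood * belief
        belief /= belief.sum()
        posteriors[i] = belief

    return q, posteriors

# Example: data simulated from the generative model described above
rng = np.random.default_rng(0)
q_true, v = 0.8, 1.0 / 15
outcomes = []
for _ in range(200):
    if rng.random() < v:                       # a jump in q occurs
        q_true = rng.random()
    outcomes.append(int(rng.random() < q_true))

grid, post = bandit_learner(outcomes, v=v)
print("posterior mean of q after the last trial:", (grid * post[-1]).sum())
```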

Exercises

Exercise 1. Look at Fig. 9.5. How do you interpret the shading on the surface shapes? Most people see the left-hand bumps as convex and the right-hand bumps as concave. Can you explain why that might be, from a Bayesian perspective? Hint: think about the role of priors.

Fig. 9.5 Convex or concave?

Exercise 2. In Fig. 9.4 we saw some interesting behavior by a Bayesian learner. For instance, at point c the model very quickly changed its belief from an environment in which left was rewarded to one in which right was rewarded. One important goal of model-based cognitive neuroscience is to link this type of change in probability distributions to observed neural phenomena. Can you come up with some phenomena that could be linked to changes in the model’s parameters?

Exercise 3. In this final exercise we will ask you to construct a simple Bayesian model. The solutions include example MATLAB code, although the exercise itself is platform independent. Consider the following set of observations of apple positions x, which Isaac made in his garden:

i     1    2    3    4    5    6    7    8    9   10
xi   63  121  148  114  131  121   90  108   76  126

  1. Find the mean, E(x), and variance, E(x²) − E(x)², of this set of observations using the formulae

$$ E(x) = \frac{1}{n}\sum_i x_i $$
$$ E(x^2) = \frac{1}{n}\sum_i x_i^2 $$
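
For a quick numerical check of these two formulae, a minimal sketch in Python (the variable names are our own; the chapter’s example solutions are in MATLAB):

```python
import numpy as np

# Isaac's ten observations of apple position x
x = np.array([63, 121, 148, 114, 131, 121, 90, 108, 76, 126])

mean_x = x.mean()                       # E(x)
var_x = (x ** 2).mean() - mean_x ** 2   # E(x^2) - E(x)^2
print("mean:", mean_x, "variance:", var_x)
```
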
  2. If I tell you that these samples were drawn from a normal distribution, x ~ N(μ, σ²), how could you use Bayes’ theorem to find the mean and variance of x? Or, more precisely, how could you use Bayes’ theorem to estimate the parameters, μ and σ², of the normal distribution from which the samples are drawn?

Hint: remember from the text that we can write

$$ p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_n) \propto p(x_1 \ldots x_n \mid x \sim N(\mu, \sigma^2))\, p(x \sim N(\mu, \sigma^2)) $$

…where the likelihood function, \(p(x_i \mid x \sim N(\mu, \sigma^2))\), is given by the standard probability density function for a normal distribution:

$$ p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(\frac{-(x-\mu)^2}{2\sigma^2}\right) $$

…and you can assume:

  1. The prior probability p(x ~ N(μ, σ²)) is equal for all possible values of μ and σ², and

  2. The observations are independent samples, such that \(p(x_i \cap x_j) = p(x_i)p(x_j)\) for all pairs of samples \(\{x_i, x_j\}\).

Now use MATLAB to work out the posterior probability for a range of pairs of parameter values μ and σ², and find the pair with the highest joint posterior probability. This gives a maximum a posteriori estimate of μ and σ², which, under the flat prior assumed above, coincides with the maximum likelihood estimate.
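
The chapter’s solutions do this step in MATLAB; purely as an illustration, a grid-based sketch in Python might look like the following (the grid ranges and variable names are our own assumptions):

```python
import numpy as np

x = np.array([63, 121, 148, 114, 131, 121, 90, 108, 76, 126])

mu_grid = np.linspace(50, 170, 121)       # candidate values of mu
var_grid = np.linspace(100, 2000, 191)    # candidate values of sigma^2
mu, var = np.meshgrid(mu_grid, var_grid, indexing="ij")

# Log likelihood of all observations for each (mu, sigma^2) pair, using the
# normal pdf and the independence assumption (sum of log likelihoods).
log_lik = np.zeros_like(mu)
for xi in x:
    log_lik += -0.5 * np.log(2 * np.pi * var) - (xi - mu) ** 2 / (2 * var)

# Flat prior over the grid, so the posterior is proportional to the likelihood.
log_post = log_lik - log_lik.max()        # subtract the max for numerical stability
post = np.exp(log_post)
post /= post.sum()

# Parameter pair with the highest joint posterior probability
i, j = np.unravel_index(post.argmax(), post.shape)
print("MAP estimate: mu =", mu_grid[i], ", sigma^2 =", var_grid[j])
```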

  3. Can you adapt this model to process each data point sequentially, so that the posterior after observation i becomes the prior for observation i + 1?

Hint: remember from the text that (assuming the underlying values of μ and σ² cannot change between observations) we can write:

$$ p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_i) \propto p(x_i \mid x \sim N(\mu, \sigma^2))\, p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_{i-1}) $$

… where the prior at trial i, \(p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_{i-1})\), is the posterior from trial i − 1.
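
Again as an illustration only (not the chapter’s MATLAB solution), the grid model above can be made sequential by reusing the mu and var grids from the previous sketch and updating the belief one observation at a time:

```python
# Sequential variant of the grid model: the posterior after observation i
# becomes the prior for observation i + 1 (reuses x, mu, var, mu_grid,
# var_grid from the previous sketch).
post = np.ones_like(mu)
post /= post.sum()                        # flat prior before any observation

for xi in x:
    lik = np.exp(-(xi - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    post = lik * post                     # Bayes' rule: likelihood x prior
    post /= post.sum()                    # renormalise after each observation

i, j = np.unravel_index(post.argmax(), post.shape)
print("Sequential MAP estimate: mu =", mu_grid[i], ", sigma^2 =", var_grid[j])
```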

  4. If you have done parts 2 and 3 correctly, the final estimates of {μ, σ²} should be the same whether you process the data points sequentially or all at once. Why is this?

Further Reading

  1. McGrayne [14] provides an historical overview of the development of Bayes’ theorem, its applications, and its gradual acceptance in the scientific community;

  2. Daniel Wolpert’s TED talk (available at http://www.ted.com/talks/daniel_wolpert_the_real_reason_for_brains.html) provides a nice introduction to the consequences of noise in neural systems and the Bayesian way of dealing with it;

  3. O’Reilly [15] discusses Bayesian approaches to dealing with changes in the environment and how different types of uncertainty are incorporated into Bayesian models and dealt with in the brain.

  4. Nate Silver’s book The signal and the noise [23] contains some nice examples of how humans make predictions and establish beliefs. Silver advocates a Bayesian approach to dealing with uncertainty; it served him well in the 2012 US presidential election, when he correctly predicted for each of the 50 states whether it would be carried by Obama or Romney.

  5. David MacKay’s book Information theory, inference, and learning algorithms [12] is a much more advanced treatment of many of the principles of Bayesian thinking. It is available for free at http://www.inference.phy.cam.ac.uk/itprnn/book.html.


Copyright information

© 2015 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

O’Reilly, J., Mars, R. (2015). Bayesian Models in Cognitive Neuroscience: A Tutorial. In: Forstmann, B., Wagenmakers, EJ. (eds) An Introduction to Model-Based Cognitive Neuroscience. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2236-9_9
