Key Points
-
The ability of primates to sense the direction and speed (together, the velocity) of moving objects depends on an elaborate sequence of computations that have been investigated in psychological, biological and theoretical studies.
-
Two cortical areas are central to this process: area V1 (the primary visual cortex) and area MT (the middle temporal area). V1 neurons are thought to measure image velocity at high spatial resolution and feed the results to area MT, which then combines the inputs to compute the overall velocity of a moving pattern.
-
The integration by the MT neurons is needed to overcome the ambiguity of the local (point-wise) velocity samples taken by the V1 neurons. This ambiguity is called the aperture problem, although fundamentally the problem stems from the sampling mechanism and not from the aperture itself.
-
Several mathematical models have been proposed to explain how the visual system measures velocity at each point in the image (local-velocity sampling). These include: gradient models, which compute the spatial and temporal intensity derivatives at each point in the image; Reichardt models, which sense the delay in activation of two spatially offset sensors; and spatiotemporal energy (STE) models, which detect motion energy with linear space–time filters. These models are not entirely distinct and are in fact mathematically equivalent under certain conditions.
-
Most experimental evidence supports an STE-like mechanism. In particular, a subclass of V1 cells (simple cells) behaves like linear space–time filters with characteristics that match those postulated by the STE models.
-
Other theories have addressed the manner of integration of local-velocity samples by MT neurons. One idea holds that each MT neuron sums input from V1 neurons whose filter characteristics place them on a common plane in frequency space. Because a moving object has all its spectral energy on a single plane in frequency space, and because the orientation of this plane corresponds to the object's velocity, this 'planar summation' by the MT neuron would have the effect of tuning it for a certain object velocity.
-
A wide range of neurophysiological evidence is consistent with the above planar summation model, but no evidence establishes it directly. Other possibilities exist: in particular, there could be nonlinear transformations early in the visual system that obviate the aperture problem by tracking unambiguous elements called features. Studies are needed to directly compare these two basic mechanisms.
-
The mechanisms of pattern-velocity computation remain incompletely understood. However, the history of interplay between theory and experiment in this area has narrowed the possibilities to a small number of formally distinct ideas.
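The ambiguity of point-wise samples, and the way pooling removes it, can be sketched with the gradient model mentioned above. This is a minimal illustration, not a model of what V1 or MT neurons actually do. It assumes the standard brightness-constancy constraint Ix*vx + Iy*vy + It = 0, which the text does not spell out, and it pools samples with an ordinary least-squares solve chosen purely for illustration. All function and variable names are hypothetical.

```python
import math

def gradient_constraint(theta_deg, velocity):
    """Return (Ix, Iy, It) for a unit-contrast edge whose spatial gradient
    points along theta_deg and which translates with the true velocity."""
    theta = math.radians(theta_deg)
    ix, iy = math.cos(theta), math.sin(theta)
    vx, vy = velocity
    it = -(ix * vx + iy * vy)  # assumed constraint: Ix*vx + Iy*vy + It = 0
    return ix, iy, it

def normal_velocity(ix, iy, it):
    """Velocity component along the gradient: all that one sample can recover."""
    g2 = ix * ix + iy * iy
    return (-it * ix / g2, -it * iy / g2)

def pooled_velocity(samples):
    """Least-squares solution of the stacked constraints (2x2 normal equations)."""
    sxx = sum(ix * ix for ix, iy, it in samples)
    sxy = sum(ix * iy for ix, iy, it in samples)
    syy = sum(iy * iy for ix, iy, it in samples)
    sxt = sum(ix * it for ix, iy, it in samples)
    syt = sum(iy * it for ix, iy, it in samples)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # Every gradient is parallel: the aperture problem cannot be resolved.
        raise ValueError("all gradients are parallel: velocity is ambiguous")
    vx = (-sxt * syy + syt * sxy) / det
    vy = (-syt * sxx + sxt * sxy) / det
    return vx, vy

true_v = (2.0, 1.0)                                   # illustrative object velocity
samples = [gradient_constraint(a, true_v) for a in (0, 60, 120)]
local = normal_velocity(*samples[0])                  # one sample, one orientation
pooled = pooled_velocity(samples)                     # three orientations combined
print("single sample:", local)
print("pooled estimate:", pooled)
```

A single sample reports only the motion component perpendicular to the local contour, whereas samples with differing orientations jointly determine both components. When every sample shares one orientation, the solve fails, which is the aperture problem stated numerically.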
Abstract
Computational neuroscience combines theory and experiment to shed light on the principles and mechanisms of neural computation. This approach has been highly fruitful in the ongoing effort to understand velocity computation by the primate visual system. This Review describes the success of spatiotemporal-energy models in representing local-velocity detection. It shows why local-velocity measurements tend to differ from the velocity of the object as a whole. Certain cells in the middle temporal area are thought to solve this problem by combining local-velocity estimates to compute the overall pattern velocity. The Review discusses different models for how this might occur and experiments that test these models. Although no model is yet firmly established, evidence suggests that computing pattern velocity from local-velocity estimates involves simple operations in the spatiotemporal frequency domain.
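The link between pattern velocity and the spatiotemporal frequency domain can be made explicit with a short derivation. It is a sketch under two assumptions that the text does not state: the pattern translates rigidly at constant velocity (v_x, v_y), and the Fourier transform uses the sign convention shown. The symbols I_0 (the static pattern) and \hat{I} (its transform) are introduced here for illustration.

```latex
\[
\begin{aligned}
I(x, y, t) &= I_0(x - v_x t,\; y - v_y t), \\
\hat{I}(\omega_x, \omega_y, \omega_t)
  &= \iiint I(x, y, t)\, e^{-i(\omega_x x + \omega_y y + \omega_t t)}\, dx\, dy\, dt \\
  &= 2\pi\, \hat{I}_0(\omega_x, \omega_y)\,
     \delta(\omega_t + v_x \omega_x + v_y \omega_y).
\end{aligned}
\]
```

Under these assumptions the spectrum vanishes everywhere except on the plane \omega_t = -(v_x \omega_x + v_y \omega_y), which passes through the origin and whose tilt is set by the velocity. A single grating occupies one pair of points on that plane, so it constrains only the velocity component along its own normal. At least two differently oriented components are needed to fix the plane, and with it the pattern velocity.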
Acknowledgements
We are grateful to E. Adelson, R. Born, A. Clark, G. DeAngelis, J. A. Movshon, W. Newsome, C. Pack, N. Priebe, G. Purushothaman, P. Wallisch and H. Wilson for assistance. Supported by US National Institutes of Health grants R01-EY013138 and R01-NS40690-01A1.
Glossary
- Spatiotemporal frequency
-
A three-dimensional frequency vector (ω_x, ω_y, ω_t) that specifies spatial frequencies ω_x and ω_y and temporal frequency ω_t. The physical counterpart is a moving sinusoidal grating.
- Band-pass filters
-
A type of linear filter that blocks low and high frequencies while allowing a certain range of intermediate frequencies to pass through.
- Simple cells
-
V1 neurons with essentially linear properties. They act as linear space–time filters that perform the first and most basic step of motion detection. Subsequent stages (performed by complex cells and MT cells) elaborate on the outputs of simple cells.
- Rectification
-
A sinusoidal wave (whatever its dimensions) oscillates symmetrically about a value of zero. If we replace the negative parts with their absolute values, we obtain a full-wave rectified signal. If we set the negative parts to zero, we obtain a half-wave rectified signal.
- Quadrature pair
-
A pair of sinusoidal functions of the same dimension and frequency but with phases that differ by 90°. Notably, sine and cosine functions have a quadrature relationship to each other.
- Gabor function
-
A sinusoidal function multiplied by a Gaussian function. The sinusoid is said to be in a Gaussian 'envelope'.
- Least-squares sampling
-
Noisy measurements often have a basic trend, such as a mean or a linear slope. If the measurements are normally distributed, then the maximum-likelihood estimate of the underlying trend is obtained by minimizing the sum of squared differences between the samples and the estimated trend. This is least-squares sampling.
- Kernel
-
A weighting function, characteristic of a particular filter, that is convolved with the input to the system.
- Response saturation
-
The levelling off of a neuron's response (at some maximum value) as stimulus intensity increases.
- Gain normalization
-
In a population of neurons that are tuned for a specific parameter and that share lateral, inhibitory connections, gain normalization removes the nonspecific effect of the overall intensity of the stimulus. This allows each neuron's firing rate to reflect the strength of the image at that neuron's preferred (tuned) value.
- Recurrent inhibition
-
Inhibition that comes from lateral connections that the inhibited cells make with neurons in the same cortical area.
- Complex cells
-
V1 neurons that are thought to represent a stage that lies one level above simple cells in the motion-processing stream. Complex cells probably combine input from simple cells with similar frequencies but different phase tuning. They tend to be phase-insensitive.
- Spike-triggered correlation and covariance
-
Neurons are sometimes responsive to specific combinations of stimulus properties (for example, luminance at different locations). As a result, these combinations tend to occur just before a spike. Spike-triggered correlation and covariance measure the average pattern of correlation that occurs in the moments preceding a spike.
- Vector sum
-
Every two-dimensional velocity has two components: the horizontal velocity, V_x, and the vertical velocity, V_y. Given a set of velocity vectors, the vector sum itself has two components: one the sum of the V_x, the other the sum of the V_y. The vector average is the vector sum divided by the number of vectors.
- Surround inhibition
-
Modulation of a visual neuron that results from the presence of a stimulus in a defined 'surround' region outside the classical receptive field of a visual neuron; as the name implies, the effect is usually but not always inhibitory.
- Static nonlinearities
-
Static nonlinearities occur when a (temporally) linear filtering operation is performed and then the output (the firing rate) is transformed with some nonlinear mechanism, such as rectification or saturation. Static nonlinearities tend to scale the output but do not affect the overall selectivity of the mechanism.
- Maximum-likelihood estimation
-
A process in which the probability of each observation in a sample of random variables (such as neural firing rates for a certain stimulus) is evaluated and the results are combined into an overall probability (termed the likelihood) of the whole set of observations. One method of stimulus discrimination is therefore to choose the stimulus for which the likelihood is greatest.
- Endstopping
-
A process in which neurons respond well to small spots but poorly to long contours that go beyond their receptive field. Endstopped neurons were first defined by Hubel and Wiesel as hypercomplex cells. The idea is that the ends of the contour tend to stop the response.
- Spectral power
-
Taking the Fourier transform of a function gives its amplitude and its phase as a function of its frequency. The amplitude portion is called the amplitude spectrum, and the square of this is the power spectrum. The area under the power spectrum over any specific range of frequencies is called the spectral power in that frequency band.
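Several of the glossary terms above (Gabor function, quadrature pair, rectification and the phase insensitivity attributed to complex cells) can be tied together in one small numerical sketch. It assumes a one-dimensional Gabor kernel and an energy-style combination, the sum of the squared outputs of a quadrature pair, as the stand-in for a complex cell. The frequency, envelope width and sampling grid are illustrative values rather than measured ones, and all names are hypothetical.

```python
import math

def gabor(x, freq, sigma, phase):
    """Sinusoid of the given phase multiplied by a Gaussian envelope."""
    envelope = math.exp(-x * x / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * freq * x + phase)

def half_wave(r):
    """Half-wave rectification: negative parts are set to zero."""
    return max(r, 0.0)

def full_wave(r):
    """Full-wave rectification: negative parts are replaced by their absolute value."""
    return abs(r)

STEP = 0.1
XS = [i * STEP for i in range(-100, 101)]  # sample positions, arbitrary units
FREQ, SIGMA = 0.25, 2.0                    # illustrative values, not fitted to data

def filter_response(kernel_phase, stimulus_phase):
    """Inner product of a Gabor kernel with a sinusoidal grating (linear stage)."""
    return STEP * sum(
        gabor(x, FREQ, SIGMA, kernel_phase)
        * math.cos(2.0 * math.pi * FREQ * x + stimulus_phase)
        for x in XS
    )

def energy(stimulus_phase):
    """Sum of squared outputs of a quadrature pair (kernel phases 0 and -90 degrees)."""
    even = filter_response(0.0, stimulus_phase)
    odd = filter_response(-math.pi / 2.0, stimulus_phase)
    return even * even + odd * odd

phases = [k * math.pi / 6.0 for k in range(12)]    # grating phases, 30-degree steps
single = [filter_response(0.0, p) for p in phases]  # one linear filter
rectified = [half_wave(r) for r in single]          # the same filter, half-wave rectified
energies = [energy(p) for p in phases]              # quadrature-pair energy

print("linear filter range:", min(single), max(single))
print("energy range:", min(energies), max(energies))
```

The output of the single linear filter swings between positive and negative values as the grating phase changes, and rectification only clips that swing. The quadrature-pair energy stays nearly constant across phases, which is the sense in which such a combination is phase-insensitive.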
Cite this article
Bradley, D., Goyal, M. Velocity computation in the primate visual system. Nat Rev Neurosci 9, 686–695 (2008). https://doi.org/10.1038/nrn2472
DOI: https://doi.org/10.1038/nrn2472