Bringing the real world into the fMRI scanner: Repetition effects for pictures versus real objects

Snow, Jacqueline C.; Pettypiece, Charles E.; McAdam, Teresa D.; McLean, Adam D.; Stroman, Patrick W.; Goodale, Melvyn A.; Culham, Jody C.

doi:10.1038/srep00130

Article
Open access
Published: 26 October 2011

Jacqueline C. Snow¹,
Charles E. Pettypiece²,
Teresa D. McAdam²,
Adam D. McLean¹,
Patrick W. Stroman³,
Melvyn A. Goodale^1,2 &
Jody C. Culham^1,2

Scientific Reports volume 1, Article number: 130 (2011)

Abstract

Our understanding of the neural underpinnings of perception is largely built upon studies employing 2-dimensional (2D) planar images. Here we used slow event-related functional imaging in humans to examine whether neural populations show a characteristic repetition-related change in haemodynamic response for real-world 3-dimensional (3D) objects, an effect commonly observed using 2D images. As expected, trials involving 2D pictures of objects produced robust repetition effects within classic object-selective cortical regions along the ventral and dorsal visual processing streams. Surprisingly, however, repetition effects were weak, if not absent, on trials involving the 3D objects. These results suggest that the neural mechanisms involved in processing real objects may be distinct from those that arise when we encounter a 2D representation of the same items. These preliminary results suggest the need for further research with ecologically valid stimuli in other imaging designs to broaden our understanding of the neural mechanisms underlying human vision.

Introduction

Almost all functional magnetic resonance imaging (fMRI) studies that have examined the human neural substrates of object processing have utilized 2-dimensional (2D) pictures of objects. Although pictures are ubiquitous in everyday life, we interact with real 3-dimensional (3D) objects far more often than 2D representations. Moreover, we have little difficulty in distinguishing between the two. Numerous cortical areas have been identified in the perception of object shape but the neural mechanisms involved in the perception of real 3D objects have received scant investigation with fMRI. In this study we ‘bring the real world into the scanner’ to examine whether the large body of evidence pertaining to human neural processing of pictorial stimuli is applicable also to real-world objects.

The processing of object shape in humans is broadly distributed across a number of cortical areas spanning both the dorsal and ventral visual pathways. Most notably, object-selective neural populations have been identified within the ventral stream along a swathe of inferior temporal cortex known as the lateral occipital complex (LOC)^1,2. The LOC is dedicated to processing object shape independent of the low-level image features that define the shape. Area LOC produces robust responses to objects depicted in a range of formats including greyscale images, line drawings, silhouettes, shapes defined by motion or textures, or when the percept of form is induced by an illusory contour³. Additional object-selective regions have also been identified within the ‘dorsal’ processing stream, particularly along the intraparietal sulcus (IPS)^3,4,5,6.

Beyond simple fMRI subtraction designs, neural coding within object-selective cortex has been further investigated using comparisons between repeated vs. unrepeated objects^7,8,9,10,11. The characteristic reduction in haemodynamic response with stimulus repetition has been variously referred to as ‘fMR adaptation’ (fMR-A)^7,12,13, or ‘repetition suppression’^14,15. fMR-A is a robust effect that is a putative analogue of a similar effect seen in nonhuman primates in which neurons within infero-temporal cortex show reduced firing rates as a result of stimulus repetition^16,17. Repetition designs have become a popular methodological approach that contrast with standard mapping techniques in their ability to probe neural selectivity in higher-order visual areas at a sub-voxel scale beyond that of traditional fMRI designs^12,15,18. In the field of object perception, repetition designs have perhaps most commonly been used to determine whether object-selective neural populations are response invariant to image transformations such as changes in viewpoint, size or illumination^4,7.

Repetition effects have been observed in human object-selective cortex with a variety of 2D image types. These include simplified monochrome shapes⁴, silhouettes¹⁹ and line drawings that convey object structure via contours^5,7 or integrated elements^20,21. Repetition effects have also been demonstrated with ‘richer’ stimuli such as greyscale photographs or other detailed images that provide more information about an object's 3D characteristics via shading and texture^{4,7,10,22,23,24}, or that induce the percept of depth so that they appear to lie in front of the fixation plane^19,25.

While this approach has been highly fruitful, we wondered how well this large body of results would generalize to realistic 3D objects. The choice of 2D stimuli to study object recognition has been largely one of convenience and experimental control. The presentation of 2D images simply requires projection of the images onto a flat screen viewed through a mirror by the participant who can lie comfortably in the supine position; moreover, the control of image parameters (e.g., size, depth, timing) is straightforward. Many additional challenges arise in the presentation of real world 3D stimuli; however, many of these problems have been solved in fMRI research on grasping and reaching where 3D objects are required to elicit normal object-directed actions^26,27,28. Such approaches involve tilting of the head and head coil to enable direct viewing of real 3D objects within reachable space ( Figure 1a ). These configurations offer realistic presentations of objects in which (a) all binocular and monocular depth cues are consistent, (b) retinal size, viewing distance and expected size are consistent and (c) the location within reachable space means that objects may afford real actions such as manipulation²⁹. Given these differences, we investigated whether the effects obtained with 2D images would be corroborated in a richer, more realistic context.

Here we used an fMR repetition paradigm to examine both the overall level of activation and repetition-based effects in the context of real-world 3D objects compared to 2D pictures. We expected clear activation and repetition effects within the ventral and dorsal stream areas identified across prior studies for both stimulus classes. However, the main question was how similar these effects would be for 3D objects. We anticipated that the overall level of activation as well as the strength of repetition effects for the richer, real-world 3D objects would be at least equal to, if not greater than, those for 2D pictures, particularly within the dorsal stream³⁰. Neurophysiology research has characterized several areas within the macaque dorsal stream with 3D object-selective responses, including the anterior intraparietal area (AIP)^31,32,33,34, lateral intraparietal area (LIP)³⁵ and caudal intraparietal sulcus (cIPS)³⁶, areas for which human homologues have been proposed³⁷. These areas are postulated to be involved in the extraction of 3D shape for visuomotor transformations associated with the control of action³⁸. Given that human dorsal stream areas show fMR-A with repeated 2D object images^4,5 and respond strongly to 3D objects³⁹, such areas may be expected to show larger responses and stronger repetition effects in the context of real-world objects.

Results

We investigated neural object representations associated with 2D pictures and real 3D stimuli within known object selective areas of human cortex. Previous fMR-A paradigms have reported robust repetition effects within the LOC when comparing repeated versus different 2D object images. Here we asked whether real 3D objects elicit a similar pattern. A slow event-related fMR-adaptation design ( Figure 1c ) was employed in which two objects appeared sequentially on each trial. Blood-oxygen-level dependent (BOLD) responses were compared across trials in which paired objects had the same identity (‘Repeat’ condition) versus trials where they were not the same (‘Different’ condition). Repetition effects were measured across two classes of stimuli: real-world 3D objects and 2D colour photographs of the same objects ( Figure 1a,b ) that were matched in all possible respects for size, distance, viewpoint and illumination. We examined repetition effects across the whole brain and within independently defined sub-regions of object-selective LOC.

Region of interest (ROI) analyses

Because of the wealth of past studies showing object selectivity and fMR repetition effects for object images in LOC, our initial analyses utilized a region of interest (ROI) approach to identify LOC within individuals based on an independent localizer run and then extract its pattern of activation from separate experimental runs. LOC was localized by contrasting epochs containing pictures of objects and shapes with those of their scrambled counterparts (see Methods). In accordance with early studies that reported fMR-A effects using 2D stimuli^7,19, we searched within two sub-divisions of LOC: an anterior-ventral portion in the posterior fusiform sulcus (pFS) and a posterior-dorsal portion of LOC (LO). Based on previous findings we anticipated that on Different trials, where object identity changes, BOLD responses should be maximal, whereas on Repeat trials, where paired objects shared the same identity, the BOLD response should be comparatively attenuated. Importantly, we anticipated that the pattern of repetition effects would be similar for 2D and 3D stimuli (if not greater in magnitude for real 3D objects).

To validate our design and procedure, fMRI signals were first compared on event-related trials involving 2D pictures. Time courses of fMRI signals on Different versus Repeat trials involving pictures are displayed in Figure 2 , for LO and pFS (left upper and lower panels, respectively). To quantify repetition effects and compare them across the different stimulus types, we used an adaptation index (AI) which estimates response difference between Repeat and Different conditions relative to the overall fMRI response to a given stimulus⁴. Positive index values reflect higher responses on Different than Repeat trials; negative values indicate the reverse pattern and values around zero indicate a lack of repetition effects. AIs were calculated using mean activation (β coefficients) in the Different versus Repeat conditions for each stimulus type and the magnitude of repetition effects contrasted using a one-sample t-test against zero and paired-samples t-tests.
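As a rough illustration of the index described above, the following Python sketch computes an adaptation index per participant and tests the indices against zero. The normalisation (Different − Repeat) / (Different + Repeat) is an assumption based on the verbal description in this section, not a formula quoted from the paper; the function names and the β values are hypothetical.

```python
import math
import statistics


def adaptation_index(beta_different, beta_repeat):
    """Assumed form: (Different - Repeat) / (Different + Repeat).

    Positive values mean a larger response on Different than Repeat trials,
    negative values the reverse, and values near zero no repetition effect.
    """
    total = beta_different + beta_repeat
    if total == 0:
        raise ValueError("overall response is zero; index is undefined")
    return (beta_different - beta_repeat) / total


def one_sample_t(values, popmean=0.0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return (mean - popmean) / (sd / math.sqrt(n)), n - 1


# Hypothetical (Different, Repeat) beta pairs, one pair per participant.
betas = [(1.2, 0.9), (0.8, 0.8), (1.5, 1.0), (1.0, 1.1)]
indices = [adaptation_index(d, r) for d, r in betas]
t_stat, df = one_sample_t(indices)
print(f"AIs = {[round(ai, 3) for ai in indices]}, t({df}) = {t_stat:.2f}")
```

A ratio of this kind is only well behaved when both β values are positive; with negative or near-zero responses the index can become unstable, which is one reason to treat this form as a sketch rather than the exact computation used in the study.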

Figure 3 plots the AIs for 2D pictures and 3D objects in LO and pFS. To provide meaningful data interpretation in a within-subjects design^40,41, error bars in Figure 3 represent the 95% confidence interval (CI) of the difference from zero. Robust repetition effects for 2D pictures were observed within both LO (t(12) = 3.68, p = 0.003) and pFS (t(12) = 5.38, p < 0.0001) sub-regions of LOC. These findings replicate those of previous studies^4,5,7,10,12 and confirm that our design and stimuli were sufficiently sensitive to demonstrate repetition effects.

Next we examined whether similar effects would be observed on 3D object trials that were randomly intermixed with 2D picture trials. Time courses of fMRI signals for 3D objects on Different versus Repeat trials within LO and pFS are displayed in Figure 2 (right upper and lower panels, respectively). Although a small change in BOLD signal was evident in the time courses of the Repeat condition relative to the Different condition, the magnitude of this effect was qualitatively attenuated compared to that observed for 2D pictures. Planned comparisons confirmed that for 3D objects, repetition effects did not reach statistical significance in LO (t(12) = 0.88, p = 0.392). In pFS, repetition effects also did not reach statistical significance (t(12) = 1.99, p = 0.057), although there was a clear trend in this direction in this more anterior sub-portion of the LO complex. Finally, paired-samples t-tests contrasting the AIs for 2D versus 3D stimuli in each ROI revealed a trend toward a difference in LO (t(12) = 2.04, p = 0.06), but no significant difference in pFS (t(12) = 0.05, p = 0.29).

As an index of between-subject consistency, the proportion of observers who showed greater fMRI BOLD response on Different versus Repeat trials was calculated for each stimulus type and ROI. The observed direction of β coefficients (i.e., a binary score reflecting Different > Repeat, or Repeat > Different) across all participants was compared to the distribution of scores to be expected by chance alone (i.e., a test of the null hypothesis that Different > Repeat in 50% of subjects) using Pearson's chi-square test. For 2D picture trials, 12/13 subjects showed effects in the expected direction (i.e., Different > Repeat) within LO (χ² = 9.31, p < 0.005) and all subjects showed this pattern within pFS, indicating that the frequency of the pattern was not attributable to chance alone. Conversely, for 3D object trials fewer subjects showed effects in the expected direction. The observed proportions were not significantly above chance levels in LO (8/13 subjects; χ² = 0.69, p > 0.40), or pFS (10/13 subjects; χ² = 3.77, p > 0.05), although there was a trend toward significance in pFS.
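The chi-square values reported above can be reproduced with a short goodness-of-fit calculation. The sketch below assumes an expected 50/50 split between the two directions, one degree of freedom and no continuity correction; the function name is hypothetical and only the Python standard library is used.

```python
import math


def chi_square_vs_chance(n_expected_direction, n_subjects):
    """Pearson chi-square test of a two-way split against a 50/50 expectation.

    Returns (chi2, p). With one degree of freedom, the upper-tail probability
    of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    expected = n_subjects / 2
    observed = (n_expected_direction, n_subjects - n_expected_direction)
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Subject counts taken from the text (13 participants in total).
for label, count in [("2D, LO", 12), ("3D, LO", 8), ("3D, pFS", 10)]:
    chi2, p = chi_square_vs_chance(count, 13)
    print(f"{label}: chi2 = {chi2:.2f}, p = {p:.4f}")
# chi2 values: 9.31 (2D, LO), 0.69 (3D, LO), 3.77 (3D, pFS)
```

With only 13 participants the expected count per cell is 6.5, so an exact binomial test would be a reasonable alternative to the chi-square approximation.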

In summary, we found robust repetition effects for repeated 2D pictures within both LO and pFS sub-regions of LOC and this pattern was highly consistent across individuals. Surprisingly, however, repetition effects were attenuated for trials involving real 3D objects; we did not observe significant repetition effects within LO or pFS sub-regions of object-selective cortex. Furthermore, the direction of effects in Different versus Repeat conditions varied across subjects in both ROIs suggesting that changes in 3D object identity did not have a reliable influence on the BOLD response.

Voxel-wise group analyses

Group-based voxel-wise GLM analyses were subsequently performed to explore repetition effects at the whole-brain level and specifically to determine whether there was evidence for repetition-based BOLD changes on 3D object trials outside of LOC. We first ran the contrast [+2D Different −2D Repeat] to identify regions showing significant repetition effects for 2D pictures (using a threshold of p<0.005, cluster size threshold corrected). Figure 4 illustrates the group results displayed on the cortical surface of a representative participant. As expected, significant areas of activation were observed within established regions of object-selective cortex. Large bilateral clusters were observed along lateral and ventral occipito-temporal cortex, including fusiform, lingual, lateral occipital and inferior temporal regions. Similar activation was also evident within ‘dorsal stream object areas’, extending from the expected location of anterior V3, dorsally into the intraparietal sulcus (IPS) anterior to the expected location of IPS-0⁴². In sharp contrast, an analogous comparison for 3D stimuli (using the contrast [+3D Different −3D Repeat] at the same p-value threshold) revealed no significant areas of positive activation, either cortically or sub-cortically ( Table 1 ). In fact, the reverse contrast [+3D Repeat −3D Different] revealed several clusters of significant activation consistent with a pattern of ‘repetition enhancement’ (i.e., greater BOLD response on Repeat than Different trials).

Table 1 Voxelwise Group Results. Talairach coordinates and cluster size for identified regions.

We then searched for areas in which activation differed significantly between 2D and 3D stimuli (collapsed across Repeat and Different trials) using the contrasts [+2D−3D] and [+3D−2D] ( Table 1 ). The comparison [+2D−3D] revealed two small clusters of positive activation: one cluster centered at the occipital pole (V1) of the RH calcarine sulcus and another in the inferior temporal gyrus of the RH. The comparison [+3D−2D] revealed no positive activation. The representation of our 2D pictures and real-world 3D instances of the same objects therefore shared the same anatomical loci. Finally, any interaction between Stimulus Type and Repetition was examined using the contrasts (a): +3D Different −3D Repeat −2D Different +2D Repeat (i.e., greater repetition effects for 3D than 2D stimuli) and (b): +2D Different −2D Repeat −3D Different +3D Repeat (i.e., greater repetition effects for 2D than 3D stimuli). Brain areas showing greater repetition effects for 2D than 3D stimuli again included largely bilateral swathes of activation around the lingual and fusiform gyri and superior temporal sulci, as well as clusters in the left parieto-occipital fissure and middle frontal gyrus of the RH. The reverse interaction contrast (i.e., greater repetition effects for 3D than 2D stimuli) revealed no positive activation clusters.
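To make the contrast logic concrete, the sketch below writes the repetition and interaction contrasts as weight vectors over the four conditions. It assumes that the interaction contrast is the difference between the two simple repetition contrasts and that the reverse interaction is its sign-flipped counterpart; the variable names and β values are hypothetical and no fMRI software package is implied.

```python
CONDITIONS = ("2D Different", "2D Repeat", "3D Different", "3D Repeat")

# Simple repetition contrasts: Different minus Repeat within a stimulus type.
REPETITION_2D = {"2D Different": 1, "2D Repeat": -1, "3D Different": 0, "3D Repeat": 0}
REPETITION_3D = {"2D Different": 0, "2D Repeat": 0, "3D Different": 1, "3D Repeat": -1}

# Interaction: greater repetition effect for 2D than for 3D stimuli,
# i.e. +2D Different -2D Repeat -3D Different +3D Repeat.
INTERACTION_2D_GT_3D = {c: REPETITION_2D[c] - REPETITION_3D[c] for c in CONDITIONS}

# Reverse interaction: every weight changes sign.
INTERACTION_3D_GT_2D = {c: -w for c, w in INTERACTION_2D_GT_3D.items()}


def contrast_value(weights, betas):
    """Weighted sum of condition betas for one voxel or region."""
    return sum(weights[c] * betas[c] for c in CONDITIONS)


# Hypothetical betas: a repetition effect for 2D pictures, none for 3D objects.
betas = {"2D Different": 1.0, "2D Repeat": 0.6, "3D Different": 0.9, "3D Repeat": 0.9}
print(contrast_value(INTERACTION_2D_GT_3D, betas))  # positive in this example
```

Each weight vector sums to zero, so a region that responds equally in all four conditions contributes nothing to any of these contrasts.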

Comparisons with Foci from Prior Studies

Finally, we sampled group activation within a number of additional ROIs that correspond to areas previously implicated in 3D form processing^4,39,43 (see Figure 5 ). Across a total of 14 ROIs spanning early visual, temporal and parietal cortex, we found significant 3D repetition effects in just two areas; one roughly corresponding to V3A and another within left-sided ‘LOtv’ – a putative visuo-tactile ‘multimodal’ sub-component of the LO complex situated along the ventro-lateral bank of the temporal lobe⁴³. In contrast, significant (or close to significant) 2D repetition effects were found in almost all of the additional ROIs (see Supplementary Table 1 ).

Discussion

Here we used slow event-related fMRI to contrast repetition-related changes in fMRI responses to 2D pictures of objects with real-world 3D exemplars. Whereas presentation of 2D pictures elicited strong repetition-related changes in the BOLD response, the same effect was surprisingly weak, if not absent, in the context of real-world 3D objects. We searched for repetition effects within discrete regions of object-selective cortex and across the whole brain. Contrary to our expectations, manipulating 3D object identity (using Repeated versus Different objects) did not produce a significant change in BOLD response within LOC. Further, within this area there was marked variability across participants in the relative magnitude of the BOLD response in Repeat versus Different 3D object conditions. Indeed, within area LO individual participants were just as likely to show a stronger BOLD response on Repeat object-identity trials for 3D objects than on Different trials for 3D objects – a pattern sometimes labeled as ‘repetition enhancement’^{2,7,22,24,44,45,46,47}. In line with these results, an analysis of group effects at the whole brain level also revealed no evidence of fMR-repetition effects on 3D object trials.

The results for real-world 3D objects contrast sharply with those for 2D object images. In line with previous reports, participants in our study showed robust fMRI repetition-based changes on randomly interleaved trials that involved 2D pictures. In the ROI analyses, significant 2D repetition effects were observed within both LO and pFS sub-regions of LOC and BOLD response patterns were highly consistent across observers. Accordingly, whole-brain analyses revealed robust repetition effects for 2D objects that spread anteriorly and bilaterally along classical ventral stream object-selective cortex and dorsally along putative object-selective cortex in the vicinity of the IPS. Finally, we found evidence for 2D repetition effects within a number of additional ROIs that correspond to areas previously implicated in 3D form processing^4,39,43. The same pattern was not observed for 3D stimuli.

Whole brain analyses showed that activation patterns were strikingly similar for our 2D pictures and 3D object trials, suggesting that our stimulus sets were well matched for low-level properties (including illumination, size, colour and viewpoint). We further quantified repetition effects using an adaptation index to account for possible underlying differences in responsivity across different brain areas to our paired 2D and 3D stimulus events⁴. The effect we observed for 2D vs. 3D stimulus classes is unlikely to be attributable to differences in eye movement patterns or shifts of attention. Our tilted-head setup precluded the use of an eye-tracker; however, all participants reported that they were able to easily discriminate all stimuli while maintaining their gaze on the fixation point. Moreover, no activation differences between 2D and 3D objects were found in eye-movement- and attention-related areas, such as the frontal eye fields or parietal cortex^48,49,50. Further, given that participants merely passively viewed the stimuli, differences in task-related attentional demands were also unlikely. It is possible that observers found the 3D objects “more interesting” than their 2D counterparts. If that were the case, however, then one would have expected to see greater activation in LOC and other object-related areas with 3D as opposed to 2D stimuli and amplified repetition effects for 3D compared to 2D stimuli^51,52. But we found exactly the opposite.

Given that explanations based on attention or eye-movements are unlikely, our results may reflect differences in the way real world 3D objects are processed as compared to 2D pictures. Real objects differ from pictures in several important respects: (a) they possess additional shape information from stereoscopic cues such as vergence and disparity, (b) both monocular and binocular cues to object shape are consistent for real objects and (c) 3D objects are tangible substances that exist in the environment. The possible contribution of each of these differences between pictures and real objects to our observed findings is considered in turn below.

Given that real objects possess additional shape information from stereoscopic cues compared to pictures, this raises the question of whether or not the same pattern observed for real objects would arise with objects defined by stereopsis alone (i.e. stereograms) where the percept of 3-dimensionality arises entirely from binocular disparity. Neurophysiological studies have identified neurons that are sensitive to shapes defined by binocular disparity within early visual areas^{53,54,55,56,57,58,59}, dorsal areas such as MT and parietal cortex^{60,61,62,63,64,65} and in the inferior temporal cortex^{66,67,68,69,70,71,72,73}. To our knowledge, no human fMRI studies to date have directly compared repetition effects for stereo versus real-world 3D objects, or stereo displays involving objects with 3D structure. Kourtzi and Kanwisher²⁵, used stereo displays involving planar shapes to show that responses within LOC were identical despite changes in the stereoscopic depth of the shape. Similarly, Kourtzi et al.,¹⁹ found equivalent BOLD responses on trials depicting identical silhouette shapes and trials where a 2D silhouette was followed by a stereo silhouette image (so that the shape appeared to lie in front of the fixation plane). These findings imply that object shape is processed similarly within LOC, whether the shape is depicted in a purely 2D format or with additional stereo cues. Importantly, however, the stimulus objects in these studies had no 3D structure; the stimuli simply defined figure from ground and provided information about the outer contours of the shape (i.e., first-order stereo). Unlike real objects, they contained no information about intrinsic curvature or shape (i.e., second-order stereo). Therefore, it remains an open question as to whether the effects observed here for real world objects would also emerge with stereo displays with objects that possess different second-order shape cues.

Another important difference between pictures and real objects is that the binocular and monocular cues to object shape are completely consistent for 3D objects but are in conflict for 2D pictures. When we look at a picture, binocular cues indicate that it is completely flat whereas monocular cues such as shading, texture gradients, occlusion, specular highlights and other pictorial cues signify a 3D representation. It is possible that classical repetition and release effects typically observed in picture viewing may be attributable to processes associated with resolving such depth cue conflict. For example, the additional processing required to decipher object identity from 2D pictures as a result of cue conflict could result in a higher fMR response (release from adaptation) on ‘Different’ 2D trials. Further, the similarity in stereo information conveyed by pictures may result in stereo cues being discounted in the analysis of object shape and other pictorial cues weighted more highly. Given that some pictorial cues can be more effective than others in conveying object shape for particular objects, these differences in the cues that are used across trials would result in greater release from adaptation on ‘Different’ trials, because different sets of neurons, each tuned to particular pictorial cues, would be engaged in each case. In contrast, because binocular cues like stereo are such powerful indicators of object shape in the case of 3D objects (which may therefore be weighted more highly in the analysis of object shape), the same set of stereo-sensitive neurons that analyse object shape would be engaged – even for different objects.

Finally, our preliminary fMRI results raise the provocative suggestion that the presence of real-world objects (i.e., as indicated initially via stereoscopic cues) invokes qualitatively different computations to those elicited by 2D images. Researchers in the field of behavioral psychophysics have expressed long-standing concern about the extent to which pictures of objects capture the properties of their real-world counterparts (i.e., their ecological validity), with reservations as to their appropriateness as stimuli with which to examine the nature of human object perception^74,75. Indeed, there are clear differences between pictures and objects that suggest some degree of caution in assuming equal neuronal response patterns between the two stimulus classes. Whereas images consist merely of patterns of light arising from a 2D projection surface, real objects are tangible substances that exist in 3D space with a definite texture, reflectance, colour and shape. Real objects, unlike pictures, have an unambiguous size, distance and location relative to the observer – factors that are known to alter single unit responses in macaque inferior temporal cortex⁷⁶. Moreover, as discussed earlier all the cues to depth structure, both binocular and monocular, are congruent for 3D objects. Finally, real objects have properties that relate specifically to the motives and needs of the observer – that is, they provide affordances⁷⁴. An object placed within arm's length affords reaching, grasping and manipulation. Indeed, fMRI studies demonstrate that information about 3D form is critical for the visual control of grasping and manipulation^26,29.

Although comparatively fewer research studies have been carried out with real-world objects than with 2D images in humans, numerous findings point to the possibility that real objects are cognitively distinct from their 2D counterparts. For example, patients with visual agnosia often show a ‘real object advantage’ in which identification of objects depicted as line-drawings or silhouettes is impaired while recognition performance for real objects remains intact^{77,78,79,80,81}. Similarly, in healthy observers, the value applied to objects is affected by the format in which they are viewed. For example, Bushong et al.⁸² gave university students a small monetary endowment that could be used to purchase a range of test objects (i.e., food or trinkets). The test items were depicted in one of three formats: text displays, high-resolution images, or actual real-world objects. Surprisingly, students were willing to pay 40–61% more for objects they viewed as real-world exemplars than for the same items depicted in text format or image displays. Moreover, this effect went away when the objects were placed behind a transparent barrier, suggesting that the effect was driven by the potential for interaction with the objects.

In summary, relative to previous research using 2D pictures^5,25,83, our findings indicate that the neural analysis of 3D objects may not fit within the classically defined pattern and that adaptation and corresponding release effects may not be an obligatory consequence of object repetition manipulations¹³. Our results further suggest that the analysis and/or representation of object structure does not proceed independently of the cues that define the object – in this case, when the term ‘object’ is extended to include actual real-world exemplars. The neural mechanisms involved in the perception of real-world 3D objects may therefore be distinct from those that arise when we encounter a 2D planar representation of the very same items. Furthermore, such processes may also change with environmental context – such as whether an object is located within reachable space²⁹. We have highlighted a number of possible routes for future investigation to further elucidate the cognitive and neural mechanisms responsible for the pattern of repetition effects reported here for 2D versus 3D objects. As we have argued, many of the simpler explanations seem unlikely (eye movements, attention), leaving the possibility of inherent differences in the processing of real objects vs. photographs. Whether the invariant neural response we observed for real-world 3D objects is attributable to the additional depth cues provided by binocular vision or the physical presence of the objects, the important finding here is that the underlying response pattern is different from that observed in the context of 2D planar images. Although many fMRI studies have used repetition designs to probe neural sensitivity to different types of stimuli, the computational mechanisms that underlie this effect are not fully understood^{84,85,86,87,88,89}. Regardless of which particular mechanisms account for repetition effects, however, there is no doubt that differential adaptation effects for 2D pictures and 3D objects reflect differences in neuronal processing and interactions.

Due to the technical challenges associated with presenting real world objects within the scanner, we used a slow event-related design. It is possible that the different pattern of repetition effects reported here for 2D versus 3D stimuli is specific to the temporal dynamics of our stimulus presentation. Similarly, the paired adaptation paradigm used in the present study may have a small dynamic range and, in the presence of noise, small but nevertheless significant repetition effects may be missed. An important question for future investigation therefore is whether or not the patterns observed here also emerge in the context of different stimulus durations or alternative fMRI designs, such as blocked or rapid event-related designs with more repetitions that yield stronger repetition effects. In any case, if the statistical power of the present design were to be increased, then it is likely that the differences that we have already observed between 2D and 3D stimuli would be amplified rather than reduced.

Our ability to perceive real 3D objects from patterns of light that project on the retina remains one of the most remarkable and yet perplexing aspects of human vision. Yet our understanding of the neural substrate of perception is largely based upon studies that have utilized 2D images. The conventional use of 2D images in fMRI research, in particular, may pose underestimated limits to our understanding of the neural underpinnings of human vision. The human visual system has largely evolved to perceive and interact with a 3-dimensional environment, rather than pictures. Surprisingly, however, there is a paucity of controlled published studies involving real objects and fewer still that directly contrast behavioral or fMR measures across objects and images. We argue here that pictures might represent a limited class of stimuli with which to characterize the neural computations associated with human object recognition⁷⁴. Our findings for real 3D objects suggest some caution in extrapolating experimental results based upon the presentation of abstract or simplified stimuli, or findings drawn from within artificial or constrained environments. Notwithstanding, these results provide an important first step in understanding how real-world stimuli are coded by the human brain and complement a growing body of research^90,91 emphasizing the importance of studying behavior in ecologically valid contexts.