Detecting stable distributed patterns of brain activation using Gini contrast
Research highlights
► Gini contrast is a stable detector of non-linear multivariate patterns in fMRI data.
► Random sampling during learning results in high reproducibility of spatial patterns.
► Higher classification accuracy compared with SVM recursive feature elimination.
► Gini contrast is an inherently multicategory criterion.
Introduction
Functional magnetic resonance imaging (fMRI) allows us to study the relationship between experimental conditions and the brain response at different locations. The traditional analysis methods analyze the data in a univariate fashion, that is, they examine the contributions of different experimental conditions to the fMRI response of each voxel separately (Friston et al., 1994). Recently, a new approach, often referred to as multivariate pattern analysis (MVPA), has emerged that considers patterns of responses across voxels that carry information about different experimental conditions (Haxby et al., 2001). In the multivariate pattern analysis framework, the response of each voxel is considered relevant to the experimental variables not only on its own but also in conjunction with the responses of other spatial locations in the brain. Most multivariate pattern analysis methods train a classifier on a subset of fMRI images in an experiment and use the classifier to predict the experimental conditions in the unseen subset. This approach has proved successful in a variety of applications (Norman et al., 2006, O'Toole et al., 2007).
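The train-then-predict scheme described above can be illustrated with a minimal sketch. It uses a nearest-centroid classifier purely for brevity; this is a stand-in chosen for illustration, not the classifier used in the paper, and the voxel values, condition labels, and function names are all hypothetical.

```python
def centroid(rows):
    """Mean response pattern (one value per voxel) over a list of trials."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train_nearest_centroid(patterns, labels):
    """Return one mean activation pattern per experimental condition."""
    by_label = {}
    for pattern, label in zip(patterns, labels):
        by_label.setdefault(label, []).append(pattern)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(centroids, pattern):
    """Assign the condition whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], pattern))


# Hypothetical toy data: each row is one trial, each column one voxel.
train_x = [[1.0, 0.1, 0.2], [0.9, 0.2, 0.1], [0.1, 1.0, 0.3], [0.2, 0.8, 0.2]]
train_y = ["face", "face", "scene", "scene"]
test_x = [[0.8, 0.3, 0.2], [0.3, 0.9, 0.1]]  # held-out trials, unseen in training
test_y = ["face", "scene"]

model = train_nearest_centroid(train_x, train_y)
predicted = [predict(model, x) for x in test_x]
accuracy = sum(p == t for p, t in zip(predicted, test_y)) / len(test_y)
print(predicted, accuracy)
```

The key point is that the classifier sees the whole pattern across voxels, and accuracy is measured only on trials that were held out of training.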
One of the major challenges of multivariate pattern analysis is that fMRI images contain a large number of uninformative, noisy voxels that carry no useful information about the category label. At the same time, voxels that do contain information are often strongly correlated. When trained with a relatively small number of examples, the resulting classifier is likely to capture irrelevant patterns and suffer from poor generalization performance. To mitigate the first problem, feature selection must be performed before, or in conjunction with, training (De Martino et al., 2008, Pereira et al., 2009).
Furthermore, the ultimate goal of most fMRI experiments is not to achieve high classification performance but to characterize the functional organization of the brain. Identifying the complete set of task-dependent meaningful features promises not only to improve the generalization performance of the learning algorithms but also to provide insights into the structure of the functional areas in the brain. Specifically, a feature selection method can identify regions that process information related to specific stimuli. In light of this exploratory goal, feature selection is no longer merely a tool for regularizing the learning algorithm; it becomes the main aim of the analysis.
In this paper, we focus on the problem of reproducible feature selection and examine a fully multifeature, multiclass method in application to fMRI analysis. The method improves upon previous approaches in terms of the generalization ability of the resulting classifiers, the robustness and completeness of the selected voxel sets, and the stability of the voxel score patterns. To quantify the predictive power of voxels in the selection procedure, we employ the Gini importance measure derived from a random forest (RF) classifier (Breiman, 2001), which we refer to as Gini contrast. This measure captures multivariate and nonlinear relationships among fMRI activations and conditions. The measure is robust to noise, exhibits stability across datasets without a need for explicit regularization, and captures the most informative voxels more accurately than previously demonstrated approaches.
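As background for the Gini importance mentioned above, the following sketch computes the standard Gini impurity of a set of labels and the impurity decrease produced by one threshold split on one voxel. In a random forest, such decreases are accumulated over every node that splits on a given feature and averaged over trees. The paper's formal definition of Gini contrast appears in its Methods section and is not reproduced here; the data, thresholds, and function names below are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease when samples are split at value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - children


# Hypothetical toy data: two voxels observed over six trials, three categories.
labels = ["face", "face", "car", "car", "shoe", "shoe"]
informative_voxel = [0.9, 1.1, 0.1, 0.2, 0.15, 0.05]  # high only for faces
noisy_voxel = [0.5, 0.1, 0.6, 0.2, 0.4, 0.3]          # unrelated to category

print(gini_decrease(informative_voxel, labels, threshold=0.5))
print(gini_decrease(noisy_voxel, labels, threshold=0.35))
```

Because the impurity is defined over all class proportions at once, the same computation applies unchanged to any number of categories. The split on the informative voxel isolates one category and lowers the impurity, whereas the split on the noisy voxel leaves the class mix unchanged on both sides.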
We demonstrate the method on a visual multicategory fMRI study of object perception and recognition. Our experimental results indicate that the proposed method outperforms the commonly used univariate and multivariate feature selection algorithms in terms of reproducibility and ranking of voxels.
This paper is organized as follows. In the next section, we review existing pattern analysis methods used for multivariate pattern analysis in fMRI studies. In the Methods section, we present the training procedure for the random forest classifiers and define the Gini contrast we use for selecting voxels. The same section also reviews our methodology for the empirical comparison across methods. The section on Image Data contains detailed information on the imaging study we used for empirical evaluation of the methods. The Results section reports the experimental results, followed by the Discussion section. We conclude in the last section.
Background and related work
Conventional localization approaches for fMRI analysis focus on explaining the variation in the response of individual voxels. Univariate statistical tests detect voxels whose fMRI response is highly correlated with the experimental variable of interest in a linear model (Friston et al., 1994). Most methods select a subset of the detected voxels that form contiguous blobs in relevant anatomical locations. For example, in the studies of visual object recognition, the localization approach was
Methods
In this section, we formally define the Gini contrast and discuss the empirical evaluation procedure we used to compare different feature selection methods. Our review of random forest classifiers follows Breiman (2001). Hastie et al. (2009) offer an excellent introduction to RF.
Image data
We evaluated the methods in a high-level vision fMRI study that included five subjects (Kanwisher, 2003). The subjects viewed images from eight different categories (Animals, Bodies, Cars, Faces, Scenes, Shoes, Trees, and Vases) in a block-design protocol. During the experiment, subjects viewed sets of eight blocks separated by a fixation period. Each block lasted 16 s, during which 20 images of one category were shown. Each block set contained one block for each category, arranged in a random
Results
We first compare ranking by Gini contrast and univariate criteria and examine the information contained in the regions selected only by the multivariate criterion. Then, we compare Gini contrast to recursive feature elimination (RFE) in terms of classification performance and reproducibility.
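One simple way to put a number on the reproducibility of a voxel selection is to score voxels on two independent subsets of the data, keep the top k from each, and measure how much the two selected sets overlap. The sketch below uses the Jaccard index for this. It is an illustrative assumption, not necessarily the reproducibility measure used in the paper, and the scores and function names are hypothetical.

```python
def top_k(scores, k):
    """Indices of the k highest-scoring voxels."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical voxel scores computed on two independent halves of a dataset.
scores_half_1 = [0.90, 0.10, 0.75, 0.05, 0.60, 0.20]
scores_half_2 = [0.85, 0.15, 0.10, 0.05, 0.70, 0.65]

selected_1 = top_k(scores_half_1, k=3)
selected_2 = top_k(scores_half_2, k=3)
print(sorted(selected_1), sorted(selected_2), jaccard(selected_1, selected_2))
```

A value near 1 means the criterion picks nearly the same voxels from both halves; a value near 0 means the selection is driven mostly by the particular sample.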
Discussion
The premise of employing multivariate pattern analysis in fMRI studies is that the relationship between BOLD signals and stimuli can be captured by multivoxel classifiers. Furthermore, this approach assumes that the detected patterns reveal information about the role of brain regions during cognitive processes.
The search for selective, or diagnostic, regions in the neuroscientific context is equated with the selection of informative features, a preprocessing step for classification. There are
Conclusion
Identification of diagnostic brain regions by means of classifiers and multivariate patterns requires careful choice of the classifier, the voxel selection criterion, and the inference made from the selected regions. In our experiments, we observed that Gini contrast as a voxel selection score identifies regions detected by univariate criteria and additional informative regions consistently missed by univariate criteria. Regions selected by the Gini contrast measure exhibit substantial overlap
References (48)
- et al. (2008). Empirical characterization of random forest variable importance measures. Comput. Stat. Data Anal.
- et al. (2003). Functional magnetic resonance imaging (fMRI) “brain reading”: detecting and classifying distributed patterns of fMRI activity in human visual cortex. Neuroimage
- et al. (2005). Classifying spatial patterns of brain activity with machine learning methods: application to lie detection. Neuroimage
- et al. (2008). Combining multivariate voxel selection and support vector machines for mapping and classification of fMRI spatial patterns. Neuroimage
- et al. (2006). Empirical characterization of random forest variable importance measures. Chemometr. Intell. Lab. Syst.
- et al. (2007). Unsupervised analysis of fMRI data using kernel canonical correlation. NeuroImage
- et al. (2007). Reading hidden intentions in the human brain. Curr. Biol.
- et al. (2000). Distributed neural systems for the generation of visual images. Neuron
- et al. (2002). The quantitative evaluation of functional neuroimaging experiments: mutual information learning curves. Neuroimage
- et al. (2003). The evaluation of preprocessing choices in single-subject BOLD fMRI using NPAIRS performance metrics. Neuroimage