Elsevier

NeuroImage

Volume 56, Issue 2, 15 May 2011, Pages 497-507

Detecting stable distributed patterns of brain activation using Gini contrast

https://doi.org/10.1016/j.neuroimage.2010.07.074

Abstract

The relationship between spatially distributed fMRI patterns and experimental stimuli or tasks offers insights into cognitive processes beyond those traceable from individual local activations. The multivariate properties of the fMRI signals allow us to infer interactions among individual regions and to detect distributed activations of multiple areas. Detection of task-specific multivariate activity in fMRI data is an important open problem that has drawn much interest recently. In this paper, we study and demonstrate the benefits of random forest classifiers and the associated Gini importance measure for selecting voxel subsets that form a multivariate neural response. The Gini importance measure quantifies the predictive power of a particular feature when considered as part of the entire pattern. The measure is based on a random sampling of fMRI time points and voxels. As a consequence, the resulting voxel score, or Gini contrast, is highly reproducible and reliably includes all informative features. The method does not rely on a priori assumptions about the signal distribution, on a specific statistical or functional model, or on regularization. Instead, it uses the predictive power of features to characterize their relevance for encoding task information. The Gini contrast offers the additional advantage of directly quantifying the task-relevant information in a multiclass setting, rather than reducing the problem to several binary classification subproblems. In a multicategory visual fMRI study, the proposed method identified informative regions not detected by univariate criteria such as the t-test or the F-test. Including these additional regions in the feature set improves the accuracy of multicategory classification. Moreover, we demonstrate higher classification accuracy and greater stability of the detected spatial patterns across runs than traditional methods such as recursive feature elimination used in conjunction with support vector machines.

Research highlights

► Gini contrast is a stable detector of non-linear multivariate patterns in fMRI data.
► Random sampling during learning results in high reproducibility of spatial patterns.
► Higher classification accuracy compared with SVM recursive feature elimination.
► Gini contrast is an inherently multicategory criterion.

Introduction

Functional magnetic resonance imaging (fMRI) allows us to study the relationship between experimental conditions and the brain response at different locations. Traditional methods analyze the data in a univariate fashion, that is, they examine the contributions of different experimental conditions to the fMRI response of each voxel separately (Friston et al., 1994). Recently, a new approach, often referred to as multivariate pattern analysis (MVPA), has emerged that considers patterns of responses across voxels that carry information about different experimental conditions (Haxby et al., 2001). In the multivariate pattern analysis framework, the response of each voxel is considered relevant to the experimental variables not only on its own but also in conjunction with the responses of other spatial locations in the brain. Most multivariate pattern analysis methods train a classifier on a subset of the fMRI images in an experiment and use the classifier to predict the experimental conditions in the unseen subset. This approach has proved successful in a variety of applications (Norman et al., 2006; O'Toole et al., 2007).
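The train-and-predict scheme described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the paper: the nearest-centroid classifier is a simple stand-in for whatever classifier is actually trained, the leave-one-run-out split is one common choice assumed here, and all function names and the toy numbers are hypothetical.

```python
from collections import defaultdict

def leave_one_run_out(run_ids):
    """Yield (train_indices, test_indices), holding out one run at a time."""
    for held_out in sorted(set(run_ids)):
        train = [i for i, r in enumerate(run_ids) if r != held_out]
        test = [i for i, r in enumerate(run_ids) if r == held_out]
        yield train, test

def fit_centroids(X, y, rows):
    """Mean response pattern (one value per voxel) for each condition."""
    groups = defaultdict(list)
    for i in rows:
        groups[y[i]].append(X[i])
    return {label: [sum(col) / len(col) for col in zip(*patterns)]
            for label, patterns in groups.items()}

def predict(centroids, pattern):
    """Assign the condition whose centroid is closest (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], pattern))
    return min(centroids, key=dist)

def cross_validated_accuracy(X, y, run_ids):
    """Train on all runs but one, test on the held-out run, and pool the results."""
    correct = total = 0
    for train, test in leave_one_run_out(run_ids):
        centroids = fit_centroids(X, y, train)
        for i in test:
            correct += predict(centroids, X[i]) == y[i]
            total += 1
    return correct / total

# Toy data (made-up numbers): 3 runs x 2 conditions, 3 "voxels" per pattern.
X = [[1.0, 0.1, 0.0], [0.0, 0.2, 1.0],
     [0.9, 0.0, 0.1], [0.1, 0.1, 0.8],
     [1.1, 0.2, 0.0], [0.0, 0.0, 1.2]]
y = ["faces", "scenes"] * 3
run_ids = [0, 0, 1, 1, 2, 2]
accuracy = cross_validated_accuracy(X, y, run_ids)
```

Any voxel selection step would have to be fitted on the training indices of each fold only; selecting voxels on the full data set before splitting would leak information about the held-out run into the classifier.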

One of the major challenges of multivariate pattern analysis is that fMRI images contain a large number of uninformative, noisy voxels that carry no useful information about the category label. At the same time, voxels that do contain information are often strongly correlated. When trained with a relatively small number of examples, the resulting classifier is likely to capture irrelevant patterns and suffer from poor generalization performance. To mitigate the first problem, feature selection must be performed before, or in conjunction with, training (De Martino et al., 2008; Pereira et al., 2009).

Furthermore, the ultimate goal of most fMRI experiments is not to achieve high classification performance but to characterize the functional organization of the brain. Identifying the complete set of task-dependent meaningful features promises not only to improve the generalization performance of the learning algorithms but also to provide insights into the structure of the functional areas in the brain. Specifically, a feature selection method can identify regions that process information related to specific stimuli. In light of this exploratory goal, feature selection is no longer merely a tool for regularizing the learning algorithm; it becomes the main aim of the analysis.

In this paper, we focus on the problem of reproducible feature selection and examine a fully multifeature, multiclass method for fMRI analysis that improves upon previous approaches in terms of the generalization ability of the resulting classifiers, the robustness and completeness of the selected voxel sets, and the stability of the voxel score patterns. We employ the Gini importance measure derived from a random forest (RF) classifier (Breiman, 2001), which we refer to as the Gini contrast, to quantify the predictive power of voxels in the selection procedure. This measure captures multivariate and nonlinear relationships among fMRI activations and conditions. The measure is robust to noise, exhibits stability across datasets without a need for explicit regularization, and captures the most informative voxels more accurately than previously demonstrated approaches.
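To make the idea of a Gini-based voxel score concrete, the following is a minimal sketch of a heavily simplified random forest that accumulates the Gini impurity decrease credited to each voxel. It is an illustration under stated assumptions, not the authors' implementation and not Breiman's full algorithm: the function names, the depth limit, the number of trees, and the synthetic data are all hypothetical. Each tree is grown on a bootstrap sample of time points, and each node considers only a random subset of voxels, which mirrors the random sampling of time points and voxels mentioned in the abstract.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, rows, candidates):
    """Return (decrease, voxel, threshold) of the best split among candidate voxels."""
    parent = gini([y[r] for r in rows])
    best = (0.0, None, None)
    for v in candidates:
        values = sorted({X[r][v] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[r] for r in rows if X[r][v] <= t]
            right = [y[r] for r in rows if X[r][v] > t]
            if not left or not right:
                continue
            children = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if parent - children > best[0]:
                best = (parent - children, v, t)
    return best

def grow(X, y, rows, n_try, scores, n_boot, rng, depth, max_depth):
    """Split a node recursively, crediting each split's weighted Gini decrease to its voxel."""
    if depth >= max_depth or len({y[r] for r in rows}) < 2:
        return
    candidates = rng.sample(range(len(X[0])), n_try)  # random subset of voxels
    decrease, v, t = best_split(X, y, rows, candidates)
    if v is None:
        return
    scores[v] += decrease * len(rows) / n_boot  # weight by fraction of samples in node
    left = [r for r in rows if X[r][v] <= t]
    right = [r for r in rows if X[r][v] > t]
    grow(X, y, left, n_try, scores, n_boot, rng, depth + 1, max_depth)
    grow(X, y, right, n_try, scores, n_boot, rng, depth + 1, max_depth)

def gini_scores(X, y, n_trees=50, n_try=2, max_depth=4, seed=0):
    """Average per-voxel Gini decrease over a forest of randomized trees."""
    rng = random.Random(seed)
    n = len(X)
    scores = [0.0] * len(X[0])
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap over time points
        grow(X, y, rows, n_try, scores, n, rng, 0, max_depth)
    return [s / n_trees for s in scores]

# Synthetic data (assumption): voxel 0 separates the two classes, voxels 1-4 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[label + data_rng.uniform(-0.3, 0.3)] + [data_rng.uniform(0.0, 1.0) for _ in range(4)]
     for label in y]
scores = gini_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda v: -scores[v])
```

Because the impurity is computed over all class proportions at once, the same code applies unchanged to more than two classes. The score of a voxel depends on which other voxels compete with it at each node, which is what makes the measure multivariate rather than a per-voxel test.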

We demonstrate the method on a visual multicategory fMRI study of object perception and recognition. Our experimental results indicate that the proposed method outperforms the commonly used univariate and multivariate feature selection algorithms in terms of reproducibility and ranking of voxels.

This paper is organized as follows. In the next section, we review existing methods for multivariate pattern analysis in fMRI studies. In the Methods section, we present the training procedure for the random forest classifiers and define the Gini contrast we use for selecting voxels. The same section also reviews our methodology for the empirical comparison across methods. The section on Image Data contains detailed information on the imaging study we used for empirical evaluation of the methods. The Results section reports the experimental results, followed by the Discussion section. We conclude in the last section.

Section snippets

Background and related work

Conventional localization approaches for fMRI analysis focus on explaining the variation in the response of individual voxels. Univariate statistical tests detect voxels whose fMRI response is highly correlated with the experimental variable of interest in a linear model (Friston et al., 1994). Most methods select a subset of the detected voxels that form contiguous blobs in relevant anatomical locations. For example, in the studies of visual object recognition, the localization approach was

Methods

In this section, we formally define the Gini contrast and discuss the empirical evaluation procedure we used to compare different feature selection methods. Our review of random forest classifiers follows Breiman (2001). Hastie et al. (2009) offer an excellent introduction to RF.
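For orientation, the Gini importance commonly associated with random forests can be written as below. The full Methods text is not reproduced in this excerpt, so the notation here (the symbols τ, p_k, n_τ, N, F and I_G) is an assumption and may differ from the paper's own; the weighting and normalization shown are one common convention and vary between implementations.

```latex
% Gini impurity of a node \tau, with p_k(\tau) the proportion of class k among its samples
i(\tau) = 1 - \sum_{k=1}^{K} p_k(\tau)^2

% Impurity decrease when \tau (n_\tau samples) is split into children \tau_l and \tau_r
\Delta i(\tau) = i(\tau) - \frac{n_{\tau_l}}{n_\tau}\, i(\tau_l) - \frac{n_{\tau_r}}{n_\tau}\, i(\tau_r)

% Gini importance of voxel v: decreases accumulated over every node that splits on v,
% weighted by the fraction of samples reaching the node and averaged over the forest F
I_G(v) = \frac{1}{|F|} \sum_{T \in F} \; \sum_{\tau \in T \,:\, v(\tau) = v} \frac{n_\tau}{N}\, \Delta i(\tau)
```

The sum over k runs over all K classes at once, so the score is defined directly for the multiclass case without a decomposition into binary subproblems.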

Image data

We evaluated the methods in a high-level vision fMRI study that included five subjects (Kanwisher, 2003). The subjects viewed images from eight different categories (Animals, Bodies, Cars, Faces, Scenes, Shoes, Trees, and Vases) in a block-design protocol. During the experiment, subjects viewed sets of eight blocks separated by a fixation period. Each block lasted 16 s, during which 20 images of one category were shown. Each block set contained one block for each category, arranged in a random

Results

We first compare ranking by Gini contrast and by univariate criteria, and examine the information contained in the regions selected only by the multivariate criterion. Then, we compare Gini contrast to RFE in terms of classification performance and reproducibility.
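One simple way to quantify the reproducibility of a voxel ranking is the overlap between the top-scoring voxel sets obtained from independent subsets of runs. The sketch below is an assumption for illustration only: the reproducibility measure actually used in the paper is not shown in this excerpt, and the function names and score values are hypothetical.

```python
def top_k(scores, k):
    """Indices of the k highest-scoring voxels (ties broken by lower index)."""
    order = sorted(range(len(scores)), key=lambda v: (-scores[v], v))
    return set(order[:k])

def top_k_overlap(scores_a, scores_b, k):
    """Fraction of the top-k voxels shared by two score maps (1.0 means identical sets)."""
    return len(top_k(scores_a, k) & top_k(scores_b, k)) / k

# Hypothetical voxel scores computed from two disjoint subsets of runs.
run_a = [0.90, 0.10, 0.80, 0.20, 0.05]
run_b = [0.70, 0.30, 0.10, 0.90, 0.02]
overlap = top_k_overlap(run_a, run_b, k=2)
```

The same function can be applied to any voxel score, for example a Gini-based score, RFE ranks converted to scores, or a univariate statistic, so that the selection methods are compared on equal terms.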

Discussion

The premise of employing multivariate pattern analysis in fMRI studies is that the relationship between BOLD signals and stimuli can be captured by multivoxel classifiers. Furthermore, this approach assumes that the detected patterns reveal information about the role of brain regions during cognitive processes.

The search for selective, or diagnostic, regions in the neuroscientific context is equated with the selection of informative features, a preprocessing step for classification. There are

Conclusion

Identification of diagnostic brain regions by means of classifiers and multivariate patterns requires careful choice of the classifier, the voxel selection criterion, and the inference made from the selected regions. In our experiments, we observed that Gini contrast as a voxel selection score identifies regions detected by univariate criteria and additional informative regions consistently missed by univariate criteria. Regions selected by the Gini contrast measure exhibit substantial overlap

References (48)

  • S. LaConte et al.

    Support vector machines for temporal classification of block design fMRI data

    NeuroImage

    (2005)
  • Z. Lao et al.

    Morphological classification of brains via high-dimensional shape transformations and machine learning methods

    NeuroImage

    (2004)
  • J. Mourão-Miranda et al.

    Classifying brain states and determining the discriminating activation patterns: support vector machine on functional MRI data

    NeuroImage

    (2005)
  • J. Mourão-Miranda et al.

    Dynamic discrimination analysis: a spatial–temporal SVM

    NeuroImage

    (2007)
  • J. Mourão-Miranda et al.

    The impact of temporal compression and space selection on SVM analysis of single-subject and multi-subject fMRI data

    NeuroImage

    (2006)
  • K. Norman et al.

    Beyond mind-reading: multi-voxel pattern analysis of fMRI data

    Trends Cogn. Sci.

    (2006)
  • S.C. Strother et al.

    The quantitative evaluation of functional neuroimaging experiments: the NPAIRS data analysis framework

    NeuroImage

    (2002)
  • Z. Wang

    A hybrid SVM–GLM approach for fMRI data analysis

    NeuroImage

    (2009)
  • Z. Wang et al.

    Support vector machine learning-based fMRI data group analysis

    NeuroImage

    (2007)
  • O. Yamashita et al.

    Sparse estimation automatically selects voxels relevant for the decoding of fMRI activity patterns

    NeuroImage

    (2008)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • L. Breiman

    Consistency for a simple model of random forests

  • T. Carlson et al.

    Patterns of activity in the categorical representations of objects

    J. Cogn. Neurosci.

    (2003)
  • R. Diaz-Uriarte et al.

    Gene selection and classification of microarray data using random forest

    BMC Bioinformatics

    (2006)