NeuroImage

Volume 49, Issue 3, 1 February 2010, Pages 2509-2519

Classification of spatially unaligned fMRI scans

https://doi.org/10.1016/j.neuroimage.2009.08.036

Abstract

The analysis of fMRI data is challenging because the data generally consist of a relatively modest signal embedded in a high-dimensional space: a single scan can contain millions of voxel recordings over space and time. We present a method for classification and discrimination among fMRI scans that is based on modeling the scans as distance matrices, where each matrix measures the divergence of spatial network signals that fluctuate over time. We used single-subject independent components analysis (ICA), which decomposes an fMRI scan into a set of statistically independent spatial networks, to extract from each subject spatial networks and time courses that have a unique relationship with the other components within that subject. Mathematical properties of these relationships reveal information about the infrastructure of the brain by measuring the interaction between and strength of the components. Our technique is unique in that it does not require spatial alignment of the scans across subjects. Instead, the classifications are made solely on the temporal activity taken by the subject's unique ICs. Multiple scans are not required, multivariate classification is implementable, and the algorithm is effectively blind to the subject-uniform underlying task paradigm. Classification accuracy of up to 90% was realized on a resting-state schizophrenia/normal dataset and a tasked multivariate Alzheimer's/old/young dataset. We propose that the ICs represent a plausible set of imaging basis functions consistent with network-driven theories of neural activity in which the observed signal is an aggregate of independent spatial networks having possibly dependent temporal activity.

Introduction

Existing neuroimaging classification methods for functional magnetic resonance imaging (fMRI) data have shown much promise in discriminating among cerebral scans, but are limited in the types of data they can handle and in the number of outcomes they can predict (Ford et al., 2003; Zhang and Samaras, 2005). In general, fMRI discrimination methods require preprocessing steps such as spatial alignment of the scans and are only infrequently suitable for multivariate classification problems (Calhoun et al., 2007) because they rely on bivariate classifiers. Spatial alignment algorithms are often constructed assuming a subject has a normal brain, and therefore may be less accurate when warping scans of patients with physical anomalies. Existing classification methods also typically require knowledge of the task paradigm, thereby limiting their application to subjects who are able and willing to perform such tasks. Here we introduce a procedure called spectral classification that is capable of multivariate discrimination among single-session fMRI scans taken during both tasked and “mind-wandering” (task-free) states. The method classifies based on the temporal structure of the data rather than the spatial structure, thereby bypassing the need for spatial alignment of the scans. We call this method spectral classification because it uses spectral graph-theoretic measurements for discrimination. We demonstrate here a non-spatial method of classification having cross-validation accuracy rates as high as 90% for bivariate classification. Mathematically, we introduce a method for comparing and classifying objects represented by distance matrices. In this paper, an entire matrix describes an fMRI scan, with each entry containing the “distance” between the activities of two components' time series; however, these methods are generally applicable to any problem in which the elements are described as matrices rather than as isolated points and discrimination among these objects is desired.

Temporally recorded neuroimaging data pose a unique challenge to classification because of the high-dimensional structure of the data sets. One scan can contain more than 120,000 recordings, often highly correlated, in only four effective dimensions consisting of space and time. Because of this, practical classification procedures require an initial dimension-reduction stage in which discriminating signal is extracted from the noisy data. In spatially based discrimination methods, localized summaries of the temporal signal are used to compress the temporal dimension into a single point at every spatial location. The spatial regions containing discriminating summary statistics are extracted and used to create a classification machine (Zhang and Samaras, 2005; Ford et al., 2003).
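As a rough illustration of the spatially localized summaries just described, the sketch below collapses each voxel's time series to a single number, its Pearson correlation with a known task regressor. The array shapes, the regressor, and the function name are assumptions for illustration only; the cited methods use a variety of summary statistics and software.

    import numpy as np

    def voxelwise_task_correlation(data, task):
        # data: 4-D array (x, y, z, time); task: 1-D task regressor of length time.
        x, y, z, t = data.shape
        ts = data.reshape(-1, t).astype(float)            # voxels x time
        ts -= ts.mean(axis=1, keepdims=True)              # center each voxel
        task = (task - task.mean()) / task.std()          # standardize the regressor
        num = ts @ task
        denom = np.linalg.norm(ts, axis=1) * np.sqrt(t)
        r = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)
        return r.reshape(x, y, z)                         # one summary value per voxel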

The summary statistics used for describing temporal activity include mean signal intensities or p-values measuring association with a known task paradigm. These regional summary statistics are compared across subjects when training the classifier, requiring that the scans be spatially aligned to a common atlas space. The most commonly used alignment algorithms (Woods et al., 1998) are 12-parameter affine transformations (three parameters each for translation, rotation, scaling, and shear) that warp a subject's brain to a common atlas space. Alignment precision is limited even for normal subjects by the low geometric flexibility of these algorithms, and is potentially harder to achieve for subjects with structural anomalies associated with mental disorders. For example, it is known that people with schizophrenia have significantly larger ventricles (Shenton et al., 2001) and that patients with Alzheimer's disease show brain atrophy (Ridha et al., 2006); standard structural alignment tools cannot account for the unique anatomical differences in these patients. Thus, spatially based discrimination methods may fail to classify across individuals simply because of poor spatial alignment.

A known task function is often correlated with regional time series to identify regions closely associated with the task. Improved alignment methods notwithstanding, localized low-order summary statistics of the regional BOLD signal may not capture higher-order discriminating information contained in the temporal domain. If functional anatomy is similar among patient groups, then the temporal information of the scans offers a new dimension with potentially discriminative information. If the group differences exist not in the spatially localized signal summary but in the native temporal activity taken by the brain, classification methods relying on summary statistics could fail to distinguish between groups. A method that instead reduced the often-redundant spatial dimension while keeping the temporal structure intact would capitalize on signal differences existing in the temporal domain rather than the spatial domain. The method proposed here is agnostic to the task function and yields similar accuracy when discriminating among identically tasked scans and among untasked scans in the two datasets tested here.

Because of the limitations of spatial discrimination methods, there is a need for a classification method that is both insensitive to spatial alignment and independent of low-order statistical summaries. Using unaligned scans, our method classifies based on the temporal activity patterns between independent components within a subject. The blind source separation method of independent components analysis (ICA) is capable of decomposing a sequence of three-dimensional images into sources consisting of statistically independent spatial maps acting over time according to possibly dependent activity patterns. When applied to fMRI data, ICA decomposes a single four-dimensional fMRI scan into a set of statistically independent spatial components (Hyvärinen and Oja, 2000). These spatially independent components have corresponding time courses that show statistical dependence with the time courses of other components. The strength of the relationship between components is indicated by coupling, or correlated intensities over time.
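A minimal sketch of the single-subject decomposition described above, using scikit-learn's FastICA as a stand-in implementation (the specific ICA software and model order used in this work are not given in this excerpt; n_components=20 is illustrative). Treating voxels as samples yields spatially independent maps, each paired with a time course whose dependence on the other components' time courses is what the classifier later exploits.

    from sklearn.decomposition import FastICA

    def single_subject_ica(X, n_components=20, seed=0):
        # X: one subject's scan flattened to (time points, voxels).
        ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
        # Spatial ICA: voxels are treated as samples, so the spatial maps are
        # statistically independent while their time courses may remain dependent.
        spatial_maps = ica.fit_transform(X.T).T           # (components, voxels)
        time_courses = ica.mixing_                        # (time points, components)
        return spatial_maps, time_courses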

It is not known which, if any, of the spatial components identified by ICA represent functional neural networks; however, it has previously been shown that ICA methods yield identifiable, stable neurological patterns. Damoiseaux et al. (2006) were able to identify 10 consistent resting-state networks common across their population that appear to correspond to identifiable phenomena such as motor function, visual processing, executive functioning, auditory processing, memory, and even the default-mode network. Identification of these components is not required by our approach to classification, but it remains a hidden layer that might be useful for neuroscientific interpretation. The general goal of our work is to develop a classification method that is independent of any trained user interaction, making the tool more practically applicable and less sensitive to experimenter bias. One consequence of this, as implemented here, is that the classification itself may be based on signals that are not directly interpretable as neural in nature. For example, it is possible that group-specific artifacts, such as head motion, might be contributing to the classifier. For the moment, we note that even in the face of this potential limitation, the classifier appears quite robust. In the future, we intend to use automated means to detect and reject identifiable artifacts (such as motion). Because the time courses alone are used for discrimination, our method does not require us to associate the spatial components with a known biological process to classify a scan; rather, we are concerned with the temporal structure that these components take, how similar they are to other components in that subject, and how this dependency varies across subjects and groups.

In the classification method described here, inter-subject component comparisons do not require multiple scans or knowledge of the underlying task paradigm. We describe the application of our methods to two separate datasets. The first consists of blocked-design task scans from normal old subjects, normal young subjects, and Alzheimer's patients, while the second consists of resting-state scans of schizophrenia subjects and normal controls. We estimated the classification testing accuracy using cross-validation (C.V.) and the out-of-bag error from the random forests (R.F.; Breiman, 2001) classifier, where the accuracy estimates how well the classifier would perform on a new scan from a previously unseen subject.

Random forests is a decision-tree machine learning method that creates many classification trees, each built from a bootstrap resample of the observations with a random subset of the predictor variables considered at each node, and grown by choosing decision rules that minimize the misclassification rate of the sampled data within each tree. Many such trees are constructed and combined to create a “forest” that decides an observation's class by voting over the decisions made by each tree. Each tree is then tested on the observations that were not selected in its bootstrap sample, giving the “out-of-bag” error, which is usually an unbiased estimate of the testing error.
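The following sketch illustrates the accuracy estimation just described, assuming a per-scan feature matrix and group labels (variable names are illustrative); it reports the out-of-bag estimate alongside a simple cross-validation estimate.

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    def fit_and_evaluate(features, labels, n_trees=500, seed=0):
        # features: (n_scans, n_features); labels: one group code per scan.
        rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                    random_state=seed)
        rf.fit(features, labels)
        oob_accuracy = rf.oob_score_                      # out-of-bag accuracy
        cv_accuracy = cross_val_score(rf, features, labels, cv=5).mean()
        return oob_accuracy, cv_accuracy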

Section snippets

Overview

The first step in spectral classification is to perform ICA individually on the scans to reduce the dimensions of the data and extract the time courses of the components. We then create distance matrices that capture the relationship between the temporal signals within a subject, and extract features from these distance matrices using the principal (largest) eigenvalues. Finally, we train a random forests classifier on the extracted features and evaluate the out-of-bag and cross-validation…
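A hedged sketch of the feature-extraction step in this pipeline, assuming the “distance” between two components is one minus the absolute correlation of their time courses and that a handful of the largest eigenvalues are retained; the paper's exact divergence measure and feature count are not reproduced in this excerpt.

    import numpy as np

    def component_distance_matrix(time_courses):
        # time_courses: (time points, components) from single-subject ICA.
        r = np.corrcoef(time_courses.T)           # component-by-component correlation
        return 1.0 - np.abs(r)                    # illustrative divergence measure

    def eigenvalue_features(dist, n_eigs=5):
        # Principal (largest-magnitude) eigenvalues of the symmetric distance matrix.
        vals = np.linalg.eigvalsh(dist)
        return np.sort(np.abs(vals))[::-1][:n_eigs]

Each scan thus contributes one short feature vector, independent of how its components are arranged in space.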

Results and discussion

The spectral classification procedure was run on both the schizophrenia/normal and the Alzheimer's/old/young datasets to obtain bivariate and multivariate classification results. The Alzheimer's/old/young dataset was also grouped into pairs to further test bivariate classification. There were two parameters involved in fitting the manifold: the neighborhood size, k̄, and the number of dimensions in which to embed, n_i. We present the results using two different parameter selection methods. For…
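The neighborhood size k̄ and embedding dimension n_i suggest an Isomap-style construction; the sketch below is an assumption rather than the paper's exact procedure. It treats each component's time course as a point and approximates geodesic distances along a k-nearest-neighbour graph (k must be smaller than the number of components).

    from sklearn.manifold import Isomap

    def geodesic_component_distances(time_courses, k=5, n_dims=3):
        # time_courses: (time points, components); each component becomes one point.
        points = time_courses.T                   # (components, time points)
        iso = Isomap(n_neighbors=k, n_components=n_dims)
        iso.fit(points)
        return iso.dist_matrix_                   # geodesic distance matrix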

Conclusion

The methods developed here can be seen as comparing interactions of spatially independent components over time within a subject and seeking differences in these interactions across groups. Mathematically, we are trying to discriminate among distance matrices, while geometrically we are comparing a group of points (components) in some unknown subject-defined space to another group of points in a different subject's space. Using the geodesic similarity unwinds the shape that each group of point…

Acknowledgments

This research was partially supported by NSF grants 0716055 and 0442992, and by the NIH Roadmap for Medical Research, NCBC grant U54 RR021813 and grant number R21DA026109. A. Anderson received support under T90DA022768; J. Quintana was supported by a VA Merit Review Award; and J. Sherin was supported by a Kempf Fund Award from the American Psychiatric Association.

We would like to thank YingNian Wu for helpful discussions and Tiffany O. Wong and Michael B. Marcus for their assistance.

References (22)

  • Ford, J., Farid, H., Makedon, F., Flashman, L.A., Mcallister, W., Megalooikonomou, V., Saykin, A.J., 2003. Patient...