Integrative analysis of the connectivity and gene expression atlases in the mouse brain
Introduction
The mammalian brain contains a large number of cells connected into an interaction network that controls the information flow among neurons (Swanson, 2003, Watson et al., 2010, Watson et al., 2011). The brain connectome plays a pivotal role in generating the cognition, emotion, and perception of an organism. Neurological diseases, such as autism and schizophrenia, are commonly found to be associated with abnormal brain connectivity (Geschwind and Levitt, 2007, Just et al., 2007, Lawrie et al., 2002). Hence, understanding the functional circuitry of the brain has become one of the central research themes in neuroscience. At the cellular level, each neuron is largely unique in the sense that it contains a distinct combination of proteins that determine how the neuron functions. At the molecular level, the proteins in a neuron are encoded by the genome, which also contains regulatory sequences that control when and where each gene is turned on or off, and at what level. In other words, the fundamental biochemistry of neurons is determined by spatiotemporal gene expression and regulation encoded in the genomic regulatory networks. This has prompted research efforts to characterize the cellular localization of gene expression in the brain and to investigate the relationship between genome and connectome (Boguski and Jones, 2004, Bota et al., 2003, Carson et al., 2005, Dong et al., 2009, Ji, 2011, Lichtman and Sanes, 2008, Swanson, 2011, Thompson et al., 2008, Toledo-Rodriguez et al., 2004, Zheng and Rajapakse, 2006).
The initial attempts to investigate the relationship between gene expression and neuronal connectivity focused on the nervous system of Caenorhabditis elegans, because the synaptic connectivity in this organism is known. In Varadan et al. (2006), computational techniques were presented to link gene expression and neuronal connectivity. In addition, sets of synergistically interacting genes were identified based on entropy minimization and Boolean parsimony. Experimental results showed that the synergistic expressions of a subset of genes are predictive of neuronal connectivity. Kaufman et al. (2006) used correlation and prediction analysis assays to study the relationship between gene expression and neuronal connectivity in C. elegans. They showed that the expression signature of a neuron carries significant information about its synaptic connectivity. They also identified a list of putative genes that retain high predictive power. Baruch et al. (2008) studied the molecular markers and logic that direct synapse formation in C. elegans. They built a probabilistic model and attempted to explain the neuronal connectivity diagram of C. elegans as a function of the expression patterns of its neurons. Their results showed that the synaptic connections in C. elegans can be predicted by using the expression patterns of only a small number of genes.
Motivated by prior research results on C. elegans, a few recent studies have attempted to investigate the relationship between gene expression and brain connectivity in the rodent brain. Since gene expression and brain connectivity data were not available for a single rodent species when those studies were performed, they usually fused data from two different species (French and Pavlidis, 2011, Wolf et al., 2011). Specifically, French and Pavlidis (2011) used the gene expression data of the mouse brain from the Allen Brain Atlas (Sunkin et al., 2013) and the connectivity data of the rat brain from the Brain Architecture Management System (Bota and Swanson, 2008) to study the relationship between gene expression and brain connectivity. By using a series of covariation analysis techniques, they reported that gene expression in the mouse brain is correlated with connectivity in the rat brain. In addition, they identified a set of genes that are most correlated with connectivity. Wolf et al. (2011) used the same data sets and tried to predict regional connectivity in the rat brain by using gene expression data from the mouse brain. They also identified a set of highly predictive genes whose functional roles in disease conditions were evaluated.
In this work, we study the relationship between gene expression and brain connectivity in a single rodent brain, namely the mouse brain. Our investigation is made possible by the recent release of the mouse brain connectivity data from the Allen Mouse Brain Connectivity Atlas (Allen Institute for Brain Science, 2012d). By integrating this resource with the Allen Mouse Brain Atlas data (Allen Institute for Brain Science, 2012b, Lein et al., 2007), we attempt to systematically study the relationship between gene expression and brain connectivity in a single mammalian brain. We employ ensemble learning methods (Zhou, 2012) for predicting the brain connectivity using gene expression data. These methods generate many base models by randomly sampling the original training data, thereby yielding accurate and robust predictions (Geremia et al., 2011, Gray et al., 2013, Liu et al., 2012, Yuan et al., 2012b). We consider two types of base models in this work, that is, tree and sparse models, which have been commonly used in neurological applications (Cribben et al., 2012, de Brecht and Yamagishi, 2012, Geremia et al., 2011, Gray et al., 2013, Ryali et al., 2010, Ye et al., 2012). One common and appealing property of these models is that they can perform feature selection and prediction simultaneously, thereby enabling us to identify genes that retain high predictive power.
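The ensemble idea described above (many base models, each fit on a random resample of the training data, with feature selection happening inside each base model) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest possible tree base model (a one-split decision stump), uses bootstrap resampling as the sampling scheme, and runs on synthetic data in which a hypothetical gene at column index 2 alone determines a binary connectivity label. All function names are illustrative.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Exhaustively pick the (feature, threshold, polarity) with fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            for polarity in (1, -1):
                # polarity 1 predicts 1 when x[j] > t; polarity -1 predicts 1 when x[j] <= t
                errors = sum(
                    ((row[j] > t) == (polarity == 1)) != bool(label)
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    return best[1:]

def predict_stump(stump, row):
    j, t, polarity = stump
    return int((row[j] > t) == (polarity == 1))

def fit_bagged_stumps(X, y, n_models=50, seed=0):
    """Fit one stump per bootstrap resample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_proba(stumps, row):
    """Fraction of base models voting for the positive class."""
    return sum(predict_stump(s, row) for s in stumps) / len(stumps)

def feature_usage(stumps):
    """Count how often each feature (gene) was selected by a base model."""
    return Counter(s[0] for s in stumps)

# Synthetic example (assumption): 60 samples, 5 "genes", label driven by gene 2 only.
rng = random.Random(1)
X = [[rng.random() for _ in range(5)] for _ in range(60)]
y = [int(row[2] > 0.5) for row in X]

stumps = fit_bagged_stumps(X, y)
usage = feature_usage(stumps)
print(usage.most_common(3))
```

Counting how often each feature is chosen across the base models gives a simple ranking of genes by predictive usefulness, which is one way to read the "feature selection and prediction simultaneously" property mentioned above.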
Our results show that gene expression is predictive of connectivity in the mouse brain when the connectivity signals are discretized. When the expression patterns of 4084 genes are used, we obtain a predictive accuracy of 93%. Our results also show that the expression patterns of a small number of genes recover almost the full predictive power obtained with thousands of genes: we achieve a prediction accuracy of 91% by using the expression patterns of only 25 genes. Gene ontology analysis of the highly ranked genes shows that they are significantly enriched for connectivity-related processes. We also performed covariation analysis on the gene expression and connectivity data. Our results show that gene expression and connectivity are correlated in the mouse brain. We show that our results on prediction and covariation analysis remain significant when spatial autocorrelation effects are considered.
Allen Mouse Brain Connectivity Atlas
The Allen Mouse Brain Connectivity Atlas (the Connectivity Atlas) provides 3-D, high-resolution maps of neural connections in the adult mouse brain (Allen Institute for Brain Science, 2012d). In this atlas, axonal projections mapped from major anatomical regions are labeled by recombinant adeno-associated virus tracers and visualized using serial two-photon tomography. The primary data consist of high-resolution images that capture the axonal projections from anatomic regions throughout the
Results and discussion
In this section, we report the results of brain connectivity prediction using ensemble methods. For each prediction task, we apply five-fold cross validation and use the area under the ROC curve (AUC) as the performance measure (Kaufman et al., 2006, Wolf et al., 2011). In this procedure, the samples are split into five (approximately) equally-sized subsets. Four subsets are used to train a model, and the fifth subset is used for performance evaluation. This process is iterated five times so
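The evaluation procedure described above (five roughly equal folds, train on four, score the held-out fold by AUC) can be sketched as follows. This is a minimal illustration under stated assumptions: the scorer is a simple mean-difference linear model standing in for the ensemble models, the per-fold AUCs are averaged (the text above does not state how folds are combined), and the data are synthetic. All function names are hypothetical.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ordered
    (positive, negative) pairs; ties count as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def five_fold_indices(n, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean_difference(X, y):
    """Stand-in model (assumption): weight = class-mean difference per feature."""
    pos = [r for r, l in zip(X, y) if l == 1]
    neg = [r for r, l in zip(X, y) if l == 0]
    return [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(len(X[0]))
    ]

def score(w, row):
    return sum(wj * xj for wj, xj in zip(w, row))

def cross_validated_auc(X, y, k=5, seed=0):
    """Train on k-1 folds, compute AUC on the held-out fold, average over folds."""
    aucs = []
    for held_out in five_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        w = fit_mean_difference([X[i] for i in train], [y[i] for i in train])
        fold_scores = [score(w, X[i]) for i in held_out]
        aucs.append(auc(fold_scores, [y[i] for i in held_out]))
    return sum(aucs) / len(aucs)

# Synthetic example (assumption): feature 0 carries the signal, the rest are noise.
rng = random.Random(0)
y = [i % 2 for i in range(100)]
X = [
    [label + rng.uniform(-0.3, 0.3)] + [rng.uniform(-0.3, 0.3) for _ in range(3)]
    for label in y
]
mean_auc = cross_validated_auc(X, y)
print(f"mean AUC over 5 folds: {mean_auc:.3f}")
```

Because AUC depends only on the ordering of scores, it suits the discretized (binary) connectivity labels without requiring a decision threshold to be chosen.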
Conclusions
In this work, we investigate the relationship between gene expression and structure-level connectivity in the mouse brain. We employ two types of ensemble models, i.e., ensemble of trees and ensemble of sparse models, for predicting brain connectivity using gene expression data. Our results show that gene expression is predictive of connectivity in the mouse brain when the connectivity signals are discretized. In addition, we show that the expression data for a small number of genes can achieve
Acknowledgments
We thank the Allen Institute for Brain Science for making the Allen Brain Atlas data available. We thank Chinh Dang, David Feng, Leon French, Terri Gilbert, Chen Goldberg, Michael Hawrylycz, Nathan Manor, Luis Puelles, Carol Thompson, and Lior Wolf for assistance in interpreting the data and results. This work is supported by the National Science Foundation grant DBI-1147134 and by the Old Dominion University Office of Research.
References (70)
- et al. Dynamic connectivity regression: determining state-related changes in brain connectivity. Neuroimage (2012)
- et al. Combining sparseness and smoothness improves classification accuracy and interpretability. Neuroimage (2012)
- et al. Spatial decision forests for MS lesion segmentation in multi-channel magnetic resonance images. Neuroimage (2011)
- et al. Autism spectrum disorders: developmental disconnection syndromes. Curr. Opin. Neurobiol. (2007)
- et al. Random forest-based similarity measures for multi-modal classification of Alzheimer's disease. Neuroimage (2013)
- et al. Reduced frontotemporal functional connectivity in schizophrenia associated with auditory hallucinations. Biol. Psychiatry (2002)
- et al. Ome sweet ome: what can the genome tell us about the connectome? Curr. Opin. Neurobiol. (2008)
- et al. Ensemble sparse classification of Alzheimer's disease. Neuroimage (2012)
- et al. Sparse logistic regression for whole-brain classification of fMRI data. Neuroimage (2010)
- et al. Genomic anatomy of the hippocampus. Neuron (2008)