Introduction

Alzheimer's disease (AD) is one of the most devastating neurodegenerative diseases, affecting memory as well as other cognitive functions of the human brain. In the absence of an immediate treatment for patients diagnosed with AD, an accurate diagnosis at an earlier stage enables early clinical interventions that could help slow down irreversible cognitive decline. Specifically, Mild Cognitive Impairment (MCI) is an intermediate stage between normal control (NC) and AD where, unlike in AD, memory deficits may remain relatively stable for years. The MCI stage is regarded as critical because patients can still benefit from adequate clinical interventions before conversion to AD.

Given the increasing number of brain imaging datasets on dementia, and particularly on AD, several methods based on neuroimage processing and machine learning have been developed for the early detection of AD conversion at the MCI stage. Detecting brain biomarkers at the MCI stage may allow effective treatment to be individualized for patients with dementia. Remarkably, the majority of these methods have relied extensively on resting-state functional and diffusion magnetic resonance imaging (MRI)1,2,3,4,5,6,7,8. Some works proposed brain network analysis methods using noninvasive diffusion MRI for AD diagnosis, where structural connectivity was measured as the degree of white matter connectivity between pairs of regions of interest (ROIs)4,8. On the other hand, several studies used functional brain networks, mostly focusing on characterizing the pairwise correlation (e.g., Pearson Correlation) between ROIs. Recently, more advanced studies proposed novel functional connectivity (FC) representations to model brain networks at different connectional levels. For example, Yu et al. proposed a method to construct brain FC by taking advantage of both Pearson Correlation and sparse learning1. While some works only used high-order FC3 considering the relationships between pairs of ROIs, more recent studies integrated both low-order and high-order FC networks along with interactions between the two levels7. However, analysis of functional networks is typically limited by the choice of one or more thresholds for examining network topology, which may discard many important and discriminative brain connectivities. Moreover, while functional MRI can produce spurious and noisy connectomes, diffusion MRI can produce biased and highly variable structural connectomes depending on the employed fiber tractography method9. Besides, structural and functional modalities are rarely both acquired in conventional clinical routine.
Additionally, distinguishing between late MCI (LMCI) patients, who might be on the verge of converting to AD, and AD patients is a much more challenging classification task than AD vs. NC or early MCI (EMCI) vs. AD. Because the brain changes separating LMCI from AD are very subtle, LMCI/AD classification remains a hard problem that has hardly been addressed in the AD literature10,11.

On the other hand, many other studies have demonstrated the importance of cortical measures derived from the multi-folded surface of the cerebral cortex, such as cortical thickness, for AD diagnosis12,13,14,15. Specifically, cortical thickness is considered a biomarker of AD progression that provides insight into normal brain development and neurodegenerative disorders, since it is correlated with changes in cognitive performance15,16,17,18. For instance, Frisoni et al. showed a reduction in cortical thickness in AD subjects compared with control subjects15. Thus, many voxel-based methods16,17 and region-based methods18 relied heavily on morphological features, including volumetric cortical thickness measurements from MRI, for AD diagnosis. However, all these methods were based on volumetric cortical thickness analysis, while there is evidence that AD alters not only volume-based cortical measures but also the shape of cortical regions (e.g., cortical thinning at different levels)19. For this reason, other studies explored cortical thickness using surface-based methods involving spectral shape description20, or combining shape-derived features with voxel features21,22. However, these approaches considered morphological features only at the vertex level. To the best of our knowledge, none of the existing network-based analysis methods for disentangling late AD states investigated the morphological connectivities between ROIs using structural T1-w MRI, i.e., modeling how the morphology of different brain regions may be affected in relation to one another. Moreover, since AD may affect the complex relationships between a set of attributes in different cortical regions, one cannot rely on a single cortical attribute to examine how the brain is progressively altered across the different stages of AD.
A more comprehensive approach would consider multiple cortical attributes (e.g., sulcal depth, cortical thickness), each representing a single view of cortical morphology. In this paper, we propose the first multi-view morphological brain connectivity representation using four networks, each derived from a different cortical attribute: a cortical thickness network, a sulcal depth network, an average curvature network, and a principal curvature network. Then, based on this multi-view connectional representation of brain morphology, we further propose novel network architectures that allow us to investigate the complex relationships between these views and to identify morphological connectional biomarkers distinguishing between LMCI and AD.

Typically, the majority of network-based methods developed for MCI/AD classification overlooked the high-order relationships between different brain connectional layers. A few recent network-analysis works proposed for classification between different AD stages (e.g., early MCI, late MCI) used a one-layer network representation23, a multi-layer network (i.e., a set of concatenated networks)24, or high-order networks10,11. Specifically, the recently proposed high-order functional connectivity networks for MCI/AD diagnosis23 integrated new high-level features that encode how different pairs of brain regions, rather than individual regions, functionally interact with each other. Nevertheless, it is possible to go further and consider new connections by exploring how pairs of networks, and not only pairs of brain regions, interact with one another. This led us to the concept of a multiplex network, a term historically coined to indicate the presence of more than one relationship between the same actors of a social network25. Some previous methods26,27,28,29,30 have explored multiplexes to study brain networks (e.g., structural, functional). These multiplex networks (or multiplexes) allow multiple types of relationships to be represented when modelling brain connectivities, thus capturing higher levels of complexity between brain regions. However, all the mentioned studies treated a multiplex as a multi-layer network without exploring similarity networks that encode the relationship between consecutive brain connectivity layers. For instance, Battiston et al. used multiplexes as a two-layer network (functional and anatomical) to extract brain subgraphs while overlooking the inter-layers that encode high-order connectivities27.
Moreover, these approaches either relied on fMRI, combined fMRI with structural MRI, or used different modalities such as MRI with PET26; none explored morphological brain networks, each based on a specific attribute of the cortical surface, with the notable exception of recent works31,32 targeting early dementia and autism spectrum disorder diagnosis.

To address this limitation, we further propose a morphological brain multiplex interleaving two types of layers: an intra-layer, which represents the morphological connectivity network of a specific cortical attribute, and an inter-layer (or similarity layer), which computes the Pearson Correlation between two consecutive intra-layers. The proposed architecture leverages both the morphological networks and the correlational relationship between each pair of consecutive layers. However, different similarity networks can be extracted by varying the order of the layers. Hence, we define an ensemble of morphological brain multiplexes, each capturing complex network-to-network relationships for a predefined set of cortical attributes; each multiplex is obtained by reordering the intra-layers, with the exception of the first one, to capture new similarity networks. With this architecture, we aim to discover morphological connectional biomarkers distinguishing between AD and LMCI patients, which can be clinically useful for the early detection of AD conversion at the MCI stage.

Results

Data processing and parameters

In our study, we used 77 subjects (41 AD and 36 LMCI) from the ADNI GO public dataset, each with a structural T1-w MR image33. Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). We used the FreeSurfer processing pipeline34 to reconstruct both the right (RH) and left (LH) cortical hemispheres for each subject from T1-w MRI35. Then we parcellated each cortical hemisphere into 35 cortical regions using the Desikan-Killiany cortical atlas35. Using the FreeSurfer pipeline, each vertex on the cortical surface was assigned four cortical attributes: maximum principal curvature, cortical thickness, sulcal depth, and average curvature.

For the deep similarity network architecture, we used two levels (l = 0, l = 1). We defined K = 6 multiplexes using 4 cortical attributes, which corresponds to all (4 − 1)! = 6 orderings of the intra-layers with the first one fixed. Multiplex \({ {\mathcal M} }_{1}\) includes the morphological networks {N1, N2, N3, N4}, \({ {\mathcal M} }_{2}\) includes {N1, N2, N4, N3}, \({ {\mathcal M} }_{3}\) includes {N1, N3, N4, N2}, \({ {\mathcal M} }_{4}\) includes {N1, N3, N2, N4}, \({ {\mathcal M} }_{5}\) includes {N1, N4, N2, N3}, and \({ {\mathcal M} }_{6}\) includes {N1, N4, N3, N2}. Here, N1 is derived from the regional mean of the maximum principal curvature, N2 from the mean cortical thickness, N3 from the mean sulcal depth, and N4 from the mean of the average curvature.

Data distribution

Table 1 displays the gender/age distribution for both AD and LMCI groups. Both groups were matched in gender and age.

Table 1 Data distribution. M: male. F: female. Total: total number of subjects in each group. Std: standard deviation.

Comparison methods

We compared our proposed architectures with two conventional methods: (1) a one-layer network architecture, and (2) a concatenated multi-layer network architecture. For the first baseline method, we used each of the M morphological brain networks individually. For the second baseline method, we constructed the multi-layer network by concatenating all morphological networks into one large network \({\mathscr{N}}=\{{N}_{1},\ldots ,{N}_{M}\}\) of size R × R × M (R = 35, M = 4).

Evaluation

We evaluated our framework by varying the number \({K}_{f}\) of selected features from 180 to 250 in steps of 10 features (Fig. 1). For the majority of the tested feature dimensions, the deep similarity architecture improved the classification performance compared with the conventional methods (i.e., one-layer network and concatenated 4-layer network), for both left and right hemispheres. Remarkably, the classification accuracies improved markedly when using particular brain multiplexes. Table 2 displays the average classification accuracies for the baseline methods as well as for the proposed architectures. The best average accuracy was achieved by multiplex 6 (with {N1, N4, N3, N2} as the ordered intra-layers) for LH (68.61%), and by multiplex 5 (with {N1, N4, N2, N3} as the ordered intra-layers) for RH (72.25%).

Figure 1

Influence of the selected features number on the accuracy of both baseline and proposed methods. (A) and (B) plot the accuracy curves against the number of selected features for deep similarity network compared with baseline methods for RH and LH, respectively. (C) and (D) plot the accuracy curves against the number of selected features for the 6 proposed multiplexes of right and left hemispheres respectively.

Table 2 LMCI/AD average classification accuracy for the proposed morphological network architectures by varying the number of top selected features from 180 to 250, with an incremental step of 10 features. ACC: accuracy. SEN: sensitivity. SPE: specificity.

Notably, the best accuracies were obtained using 220 and 210 features for RH and LH, respectively (Fig. 2). When using one-layer morphological networks, the classification accuracy only reached 63.64% for LH and 66.23% for RH. These results decreased when concatenating features extracted from all morphological networks (level 0), as the classification rate was limited to 59.74% for RH and 58.44% for LH. However, when we further integrated the similarity networks between all pairs of layers in the proposed deep similarity network architecture (level 1), the performance increased for both RH (67.53%) and LH (64.94%) compared with the one-layer and concatenated multi-layer network architectures. For both hemispheres, the highest accuracies were achieved using multiplexes \({ {\mathcal M} }_{6}\), \({ {\mathcal M} }_{5}\), and \({ {\mathcal M} }_{1}\). Specifically, multiplex \({ {\mathcal M} }_{6}\) achieved the best accuracy among all network architectures, peaking at 77.92% for RH and 71.43% for LH. This shows that the proposed similarity networks help to better discriminate between LMCI and AD subjects. This was also reflected by the percentages of discriminative features belonging to the similarity inter-layer networks for each multiplex, as shown in Table 3. We note that for some multiplexes, more than 50% of the \({K}_{f}\) discriminative features lay in the network inter-layers (i.e., the similarity networks).
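As a reading aid, the reported percentages can be related to subject counts. The short sketch below assumes that each accuracy is the fraction of the 77 subjects correctly labeled under leave-one-out cross-validation; this interpretation and the helper name `implied_correct` are ours, not stated in the text.

```python
N_SUBJECTS = 77  # 41 AD + 36 LMCI, as given in the data description

# Accuracies (in %) quoted in the paragraph above.
reported = [77.92, 71.43, 63.64, 66.23, 59.74, 58.44, 67.53, 64.94]

def implied_correct(pct, n=N_SUBJECTS):
    """Number of correctly classified subjects implied by an accuracy in %."""
    return round(pct / 100 * n)

counts = {pct: implied_correct(pct) for pct in reported}

# Check that each implied count reproduces the reported two-decimal percentage.
consistent = all(round(100 * k / N_SUBJECTS, 2) == pct for pct, k in counts.items())

for pct, k in counts.items():
    print(f"{pct:5.2f}% -> {k}/{N_SUBJECTS} subjects")
```

Under this assumption, the gap between the best multiplex and the best one-layer network for RH corresponds to a small number of additional subjects classified correctly, which is worth keeping in mind given the sample size.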

Figure 2

Best performances reached respectively by right hemisphere RH and left hemisphere LH, using different architecture networks. ‘A’ denotes cortical attribute, ‘DS’ denotes deep similarity, and ‘M’ denotes Multiplex.

Table 3 Percentage of discriminative features belonging to the similarity inter-layers and the intra-layers for each of the proposed multiplexes.

Identified morphological connectional biomarkers for LMCI/AD classification

We further explored our multiplex architecture and morphological networks to identify morphological connectional biomarkers that discriminate between LMCI and AD patients. Since we aimed to find the most discriminative morphological connections, we chose the brain multiplex with the highest discriminative power. For LH, we found that the discriminative power of multiplex \({ {\mathcal M} }_{6}\) was the most reproducible, since it gave the best average accuracy across different numbers of selected features as well as the best accuracy (for \({K}_{f}=210\)) in comparison with all other network architectures. As for RH, multiplex \({ {\mathcal M} }_{5}\) achieved the best average accuracy across different numbers of selected features, while the highest accuracy was reached by multiplex \({ {\mathcal M} }_{6}\) with 220 features. Hence, we selected multiplex \({ {\mathcal M} }_{6}\), which achieved the best accuracies for both hemispheres, to discover morphological connectional biomarkers, using 220 discriminative features for RH (77.92%) and 210 for LH (71.43%).

In Fig. 3, we visualized using circular graphs the top most frequently selected morphological brain connectivities in multiplex \({ {\mathcal M} }_{6}\). Circular graphs were plotted for the top 10, 15 and 20 discriminative features, respectively. The thickness of each edge connecting a pair of ROIs represents the normalized rank of the discriminative brain connection. The most discriminative connections with the highest normalized ranks have thick edges, while those with less discriminative power have thinner edges. Blue edges denote connections belonging to a multiplex inter-layer, while red edges denote connections falling into a multiplex intra-layer.

Figure 3

Most discriminative morphological cortical network connections between LMCI and AD for RH and LH, respectively.

We noted that 20% (resp. 50%) of the top 10 discriminative features were located in the multiplex inter-layers for RH (resp. LH). Using the normalized ranks, the most discriminative connectional features for RH connected the Entorhinal Cortex (EC) (region 6) and the Caudal Middle Frontal Gyrus (CMFG) (region 3), EC and Temporal Pole (TP) (region 33), EC and Frontal Pole (FP) (region 32), EC and Bank of the Superior Temporal Sulcus (BSTS) (region 1) and the fifth connectivity was between the Paracentral Lobule (PL) (region 17) and Caudal Anterior-cingulate Cortex (CAC) (region 2), respectively. As for the LH, the most discriminative features connected the EC and the Rostral Middle Frontal Gyrus (RMFG) (region 27), EC and Lingual Gyrus (LG) (region 13), EC and Postcentral Gyrus (region 22), EC and CAC, and PL and Precentral Gyrus (region 24), respectively. We noted that the most discriminative morphological hub node in multiplex \({ {\mathcal M} }_{6}\) for both hemispheres was the entorhinal cortex, where the top four discriminative connections with the highest normalized ranks branched from it.

Besides, when we increased the number of discriminative features from 10 to 20, new connectivities appeared, most of which emerged from the EC. Moreover, the percentage of the top discriminative features belonging to inter-layers was low for RH (~15%) compared to LH (~45%). We also note that the majority of the most discriminative morphological brain connections fell into the 5th layer (A3), which represents the mean sulcal depth attribute (Tables 4 and 5).

Table 4 Top 10 discriminative morphological connections in the cortex with their corresponding layers for the right hemisphere.
Table 5 Top 10 discriminative morphological connections in the cortex with their corresponding layers for the left hemisphere.

More importantly, while most discriminative features belonging to intra-layers emerged from the EC for both left and right hemispheres, those belonging to inter-layers emerged from the fusiform gyrus and the paracentral lobule for RH. The same regions were present in LH, along with additional hub nodes including the precentral gyrus, transverse temporal cortex, rostral anterior cingulate cortex, and isthmus-cingulate cortex.

Discussion

We proposed a novel representation of brain connectivity to identify connectional biomarkers, based on the morphology of the cerebral cortex, for distinguishing between late mild cognitive impairment patients and Alzheimer’s disease patients. To the best of our knowledge, this study is the first to investigate the role of several morphological connectivity networks, as well as the correlation between them, in discovering morphological connectional brain biomarkers that fingerprint the difference between LMCI and AD states. In particular, we proposed two brain architectures: the deep similarity network and the multiplex network. While in the first architecture we simply concatenated all possible similarity networks with the main morphological networks, in the second one we constructed similarity networks only between successive layers, and generated different multiplexes by reordering the morphological layers.

Our proposed architectures achieved better performance than the one-layer morphological network and the concatenated 4-layer network. This shows that aggregating different similarities between morphological brain connections helps better discriminate between AD and LMCI patients compared with using a single morphological network, or even all morphological networks without exploring their relationships (Fig. 2). Moreover, our multiplex architecture achieved the highest accuracies, which indicates that the similarity inter-layers between morphological networks are able to capture higher-level discriminative information. This suggests that disease-driven changes in cortical shape quantified using a specific cortical attribute can also be influenced by shape changes measured using a different cortical attribute.

By using different cortical attributes and identifying the most discriminative features with multiplex \({ {\mathcal M} }_{6}\), we found that the mean sulcal depth has the highest discriminative power (Tables 4 and 5). Sulcal depth has been identified in the literature as one of the quantitative measures of the cerebral cortex, representing an important morphological biomarker for AD36,37. Im et al. presented a surface-based method that investigated changes of sulcal shape in MCI and AD, using sulcal depth and average mean curvature36. They showed that the progression of disease from NC to MCI and from MCI to AD was coupled with a decrease in sulcal depth. The same finding was replicated by Yun et al.37, who proposed an automated sulcal depth measurement on the cortical surface and highlighted that mean sulcal depth in MCI was lower than in NC.

The most discriminative morphological connectivities, with the highest normalized ranks, were established between EC and CMFG for RH, and between EC and RMFG for LH. Many studies highlighted that both RMFG and CMFG are discriminative regions in AD diagnosis38. It was also noted that about 18% of the CMFG atrophies in AD patients38.

One of the major findings of our study is the detection of morphological brain connectional biomarkers fingerprinting the distinction between LMCI and AD dementia brain states. We found that 85% (resp. 65%) of the most discriminative RH (resp. LH) regions fingerprinting LMCI/AD classification were connected to the EC (Fig. 3). The EC has a major role in working memory processing39,40,41,42,43. Its importance was revealed due to its anatomical interconnection with the hippocampus, which is the major region responsible for memory formation43,44. The role of the EC consists of generating coding schemes for new memories and storing them temporarily. It has numerous reciprocal connections with the hippocampus; specifically, effective connectivity in the hippocampus strongly depends on the connectivity among EC layers45.

Our findings based on cortical morphological connectivity are in line with previous studies, since the EC has been considered a good biomarker for AD and MCI in the literature46,47,48,49,50. It has great potential for detecting early memory decline and is considered the region of early neurodegeneration caused by dementia. Velayudhan et al. examined the relationship between EC thickness, hippocampal volume and whole brain volume, and showed that AD patients have a thinner EC and a smaller hippocampal volume compared with MCI subjects46. The same hypothesis about the role of the EC was supported by the work of Thaker et al.47, which considered EC thickness as a marker of medial temporal and neocortical AD neuropathology. The review paper48 also highlighted early EC atrophy detection as an important anatomical marker for MCI and AD, since it was highly correlated with the early pathological changes in AD. Besides, greater changes were present in the right EC compared with the left one, which substantiates our results, since we achieved the best multiplex-based LMCI/AD classification performance using the right hemisphere.

Our study has a few limitations. First, although we used different types of morphological attributes, we simply concatenated all derived connectivities to extract features without creating fused predictors of disease diagnosis. Second, though we identified key morphological connectional biomarkers for the LMCI stage, mainly involving the entorhinal cortex, we did not investigate the connection of the discovered cortical regions to non-cortical regions (e.g., EC to hippocampus). It is still not clear how the shape-based morphological connectivities of the EC relate to the functional or structural connectivities of the hippocampus. Third, since MCI is progressive, tracking the discriminative power of the identified morphological biomarkers over time could help better understand how the morphology of a specific discriminative region (e.g., EC thickness) is progressively altered in relation to cognition and to other cortical attributes42,44. Fourth, although the structural underpinning of morphological networks remains unclear, co-varying brain regions were suggested to result from mutually trophic influences or common experience-related plasticity51,52. In particular, it was noted that the pattern of cortical thickness correlation of certain brain regions is similar to the underlying fiber connections from DTI tractography53. Gong et al. also pointed out that approximately 35–40% of connections converge between brain networks built from thickness and diffusion measurements, which suggests that thickness correlations include exclusive information54.

In our future work, we will investigate longitudinal morphological connectivities to improve our framework, as well as longitudinal morphological changes in the EC. Besides, we will use advanced methods for fusing different morphological and similarity networks55 while integrating other multimodal brain networks (e.g., resting-state functional networks and structural diffusion networks) into our proposed brain multiplex architectures26.

Methods

We first introduce our morphological brain network construction strategy from structural T1-w MRI. Then, we propose two different architectures to explore the relationship between multiple morphological views of brain connectivity: (1) a deep multi-level similarity network that aggregates different morphological brain networks with hierarchical combinations of similarity networks between them; and (2) a morphological brain multiplex network, which is defined by inserting additional inter-layers between the aggregated networks. Last, we perform feature extraction and selection to classify a testing subject and to identify morphological biomarkers. Figure 4 displays the key steps of the proposed framework.

Figure 4

Illustration of AD/LMCI classification framework steps for the proposed cortical morphological network architectures. (A) We generate different morphological networks, each derived from a specific attribute of the cortical surface shape. (B) For each multi-layer network, we extract features from the triangular part of each cortical connectivity matrix. (C) For feature selection, we use IFS strategy (Roffo et al.56), then we train a linear support vector machine classifier using the selected connectional features.

Morphological Brain Network Definition

Following the cortical surface parcellation into R anatomical regions, for each ROI \({R}_{i}\), we average the cortical attribute a across all vertices v in \({R}_{i}\) as follows:

$$\frac{1}{\#\{v\in {R}_{i}\}}\sum _{v\in {R}_{i}}a(v),$$

where \(\#\{v\in {R}_{i}\}\) denotes the number of vertices v belonging to ROI \({R}_{i}\), and a(v) the cortical attribute value assigned to vertex v. Ultimately, to define the morphological connection \({N}_{a}(i,j)\) in network \({N}_{a}\) between ROIs \({R}_{i}\) and \({R}_{j}\), we compute the absolute difference between averaged cortical attributes in both ROIs:

$${N}_{a}(i,j)=|\frac{1}{\#\{v\in {R}_{i}\}}\sum _{v\in {R}_{i}}a(v)-\frac{1}{\#\{v\in {R}_{j}\}}\sum _{v\in {R}_{j}}a(v)|.$$

Given R cortical regions in each hemisphere, the size of each fully connected morphological network is R × R. We note that according to our definition, as two ROIs \({R}_{i}\) and \({R}_{j}\) become more similar in morphology, \({N}_{a}(i,j)\) tends to 0.
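The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the per-vertex attribute values and the per-vertex region labels are already available as arrays; the function name `morphological_network` and the toy values are hypothetical and not part of the original pipeline.

```python
import numpy as np

def morphological_network(attribute, labels, n_regions):
    """Return the R x R matrix N_a of absolute differences between regional means.

    attribute: 1-D array with one cortical attribute value a(v) per vertex.
    labels:    1-D integer array with the region index (0..R-1) of each vertex.
    """
    attribute = np.asarray(attribute, dtype=float)
    labels = np.asarray(labels)
    # Mean of a(v) over the vertices of each ROI R_i.
    sums = np.bincount(labels, weights=attribute, minlength=n_regions)
    counts = np.bincount(labels, minlength=n_regions)
    means = sums / counts  # assumes every region contains at least one vertex
    # N_a(i, j) = |mean_i - mean_j| for all pairs, via broadcasting.
    return np.abs(means[:, None] - means[None, :])

# Toy example: 3 regions, 6 vertices (values are made up for illustration).
thickness = np.array([2.0, 2.4, 3.0, 3.2, 1.0, 1.4])
region_of_vertex = np.array([0, 0, 1, 1, 2, 2])
N = morphological_network(thickness, region_of_vertex, n_regions=3)
print(N)
```

The resulting matrix is symmetric with a zero diagonal, and smaller entries indicate regions whose averaged attribute values are closer.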

Proposed Morphological Network Architectures

To extract relevant and high-order morphological features from a set of M morphological cortical brain networks \(\{{N}_{1},\ldots ,{N}_{M}\}\), each encoding a specific shape attribute of the cortical surface, we propose ‘simple-to-complex’ strategies for building network architectures that capture different characteristics of how these networks interact with one another. In particular, high-order network architectures aim to reflect how these networks are nested with respect to one another in a high dimensional manifold of networks.

Proposed deep similarity network architecture construction

We first propose a deep multi-level network architecture, where each level integrates the similarity networks between all pairs of networks in the previous level. The relationship between pairs of networks is defined by the Pearson correlation (Fig. 5A). Thus, we define the degree of correlation between different cortical networks at each level. The baseline level (l = 0) is composed of all concatenated networks \({{\mathscr{N}}}^{0}=\{{N}_{1}^{0},\ldots ,{N}_{M}^{0}\}.\) To build the next level, we create a larger multi-layer network by concatenating \({n}_{s}\) similarity networks, where \({n}_{s}={C}_{M}^{2}=M!/((M-2)!\,2!)\) is the number of possible pairwise combinations between M networks. This produces a new deeper network \({{\mathscr{N}}}^{1}={{\mathscr{N}}}^{0}\cup \{{S}_{1,2},\ldots ,{S}_{pq},\ldots ,{S}_{M-1,M}\}\), where p and q represent the indices of two different networks in \({{\mathscr{N}}}^{0}\). For brevity, we denote the baseline network at a specific level l as \({{\mathscr{N}}}^{l}=\{{N}_{1}^{l},\ldots ,{N}_{{M}_{l}}^{l}\}\), where \({M}_{l}\) represents the total number of level l networks. Hence, in the next level (l + 1), we consider \({{\mathscr{N}}}^{l}\) as the baseline network, and add the similarity networks at level l + 1 as: \({{\mathscr{S}}}^{l+1}=\{{S}_{1,2},\ldots ,{S}_{pq},\ldots ,{S}_{{M}_{l}-1,{M}_{l}}\}\), where \({S}_{pq}\) represents the similarity network between networks \({N}_{p}^{l}\) and \({N}_{q}^{l}\). From level to level, we gradually add similarity networks between networks in the previous level (including similarity networks), thereby producing deeper networks from one level to the next, where \({{\mathscr{N}}}^{l+1}={{\mathscr{N}}}^{l}\cup {{\mathscr{S}}}^{l+1}\) (Fig. 5C). The deep multi-level similarity network architecture is thus constructed in a hierarchical way, which captures not only network-to-network similarities, but also ‘similarity-to-similarity’ similarities.
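The level-by-level construction can be illustrated with a short sketch. The text does not spell out how the Pearson correlation between two R × R networks is turned into an R × R similarity network, so the function `row_pearson_similarity` below is an assumption made only for illustration: entry (i, j) is taken to be the Pearson correlation between row i of the first network and row j of the second. All names and the random toy networks are hypothetical.

```python
from itertools import combinations

import numpy as np

def row_pearson_similarity(net_p, net_q):
    """ASSUMED similarity: S[i, j] = Pearson r between row i of net_p and row j of net_q.

    Under this assumption S is generally not symmetric.
    """
    n = net_p.shape[0]
    # np.corrcoef stacks the rows of both inputs; the off-diagonal block
    # holds the correlations between rows of net_p and rows of net_q.
    return np.corrcoef(net_p, net_q)[:n, n:]

def next_level(networks, similarity=row_pearson_similarity):
    """Build level l+1 from level l: keep all networks, append all pairwise similarities."""
    sims = [similarity(p, q) for p, q in combinations(networks, 2)]
    return list(networks) + sims

# Toy level 0: M = 4 random morphological networks over R = 5 regions.
rng = np.random.default_rng(0)
R, M = 5, 4
level0 = []
for _ in range(M):
    means = rng.normal(size=R)
    level0.append(np.abs(means[:, None] - means[None, :]))

level1 = next_level(level0)
print(len(level0), len(level1))  # M networks, then M + C(M, 2) networks
```

With M = 4, level 1 holds 4 + 6 = 10 networks. The count grows quickly with depth, since each level adds one similarity network per pair of networks in the previous level.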

Figure 5

Proposed deep similarity network and multiplex network architectures, with illustration of the similarity network construction step. (A) We generate the similarity network between two morphological attribute networks by computing Pearson Correlation between them. (B) We construct the multiplex architecture where each inter-layer is a similarity network. (C) We consider the inter-relations between \({{\mathscr{N}}}^{l}\) networks through progressively concatenating, from a previous level l to a current level (l + 1), all possible similarity networks between pairs of networks.

Proposed ensemble multiplex network architecture construction

Although the proposed deep similarity network architecture allows us to explore similarities between networks at different hierarchical levels, it aggregates the similarity networks at the end of the previous multi-layer network in an agglomerative manner, without enabling us to take into account the most correlated pairs of morphological networks. To enforce a more structured design of networks and their similarities, we propose to use a multiplex network to model the inter-relations between different layers. In a generic way, we define a brain multiplex \( {\mathcal M} \) as a set of M intra-layers \(\{{N}_{1},\ldots ,{N}_{M}\}\) (i.e., morphological networks), where between two consecutive layers \({N}_{i}\) and \({N}_{j}\), we slide an inter-layer \({S}_{i,j}\). This yields the following multiplex architecture: \( {\mathcal M} =\{{N}_{1},{S}_{1,2},{N}_{2},\ldots ,{N}_{i},{S}_{i,j},{N}_{j},\ldots ,{N}_{M}\}\) (Fig. 5B). Unlike the previous architecture (Fig. 5C), we note that for a specific multiplex, we are only allowed to explore similarities between consecutive layers. As in the deep similarity network architecture, we use Pearson Correlation to generate the inter-layers. Hence, to explore the inter-relationship between all possible combinations of layers for each subject, we generate K multiplexes by simply reordering the intra-layer networks while fixing the first intra-layer, thereby generating an ensemble of multiplexes \({\mathbb{M}}=\{{ {\mathcal M} }_{1},\ldots ,{ {\mathcal M} }_{K}\}\). Each of these multiplexes captures specific similarities between different kinds of morphological networks (e.g., sulcal depth network and cortical thickness network) that may not be present in another brain multiplex.
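The ensemble construction can be sketched symbolically, without committing to how each inter-layer is computed. The snippet below enumerates the orderings of the intra-layers with the first one fixed and interleaves the inter-layer labels; the function names are hypothetical, and the enumeration order produced here need not match the numbering of the multiplexes used in the Results.

```python
from itertools import permutations

def multiplex_orderings(n_attributes):
    """All orderings of intra-layers 1..M that keep intra-layer 1 in first position."""
    rest = range(2, n_attributes + 1)
    return [(1, *perm) for perm in permutations(rest)]

def multiplex_layout(order):
    """Interleave intra-layers N_i with inter-layers S_{i,j} between consecutive ones."""
    layout = [f"N{order[0]}"]
    for prev, nxt in zip(order, order[1:]):
        layout.append(f"S{prev},{nxt}")
        layout.append(f"N{nxt}")
    return layout

orderings = multiplex_orderings(4)
layouts = [multiplex_layout(order) for order in orderings]
for layout in layouts:
    print(" | ".join(layout))
```

For M = 4 attributes this yields (M − 1)! = 6 multiplexes, each with 2M − 1 = 7 layers (4 intra-layers and 3 inter-layers).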

Network feature extraction and selection for classification

For each of the proposed subject-specific network architectures in the previous section, we perform feature extraction, selection and LMCI/AD classification as follows.

Feature extraction

To explore the discriminative power of each region-to-region morphological connectivity in the cortex, we directly use the weights of edges in the morphological network as connectional brain features.

Since the constructed morphological connectivity matrix (or network \({N}_{a}\)) is symmetric (Fig. 4A), connectional features are extracted for each subject by concatenating the weights of all connectivities in the off-diagonal upper triangular part of the matrix. Of note, for each network of size R × R, we extract a feature vector of size R(R − 1)/2. For the deep similarity network and multiplex architectures, we extract features from each network in the architecture, then concatenate them into a single high-dimensional feature vector. For a network architecture comprising \({M}_{n}\) networks, the size of the final feature vector is \({M}_{n}\times R(R-1)/2\).
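
As a small illustration of the feature vector sizes above, the following Python sketch vectorizes the off-diagonal upper triangle of each network and concatenates the results. The function names and the values of R and M_n are hypothetical and chosen only to make the arithmetic visible.

```python
import numpy as np


def network_features(network):
    """Vectorize the strict upper triangle of an R x R network."""
    rows, cols = np.triu_indices(network.shape[0], k=1)
    return network[rows, cols]


def architecture_features(networks):
    """Concatenate the per-network vectors of one architecture."""
    return np.concatenate([network_features(net) for net in networks])


R, M_n = 35, 7  # hypothetical sizes, for illustration only
rng = np.random.default_rng(0)
# Random placeholder matrices; only their upper triangles are read.
networks = [rng.random((R, R)) for _ in range(M_n)]
features = architecture_features(networks)
print(features.shape)  # (4165,) since 7 * 35 * 34 / 2 = 4165
```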

Feature selection and classification

Due to the high dimensionality of the extracted feature vectors and the small number of data samples, feature selection is a key step in classification tasks, both to reduce the dimension of the training feature vectors and to single out the most discriminative features. To this aim, we train a support vector machine (SVM) classifier using a leave-one-out cross-validation (LOO-CV) strategy. Given P subjects, we apply Infinite Feature Selection (IFS)56 in a supervised manner using the (P − 1) training subjects to select the top \({K}_{f}\) features that significantly distinguish LMCI from AD patients. The features most frequently selected across the LOO-CV runs represent the morphological connectional biomarkers that allow us to distinguish between AD and LMCI patients.
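
The protocol above, with feature selection repeated inside every leave-one-out run, can be sketched with scikit-learn. This is an illustration under stated assumptions, not the authors' pipeline: IFS is not part of scikit-learn, so `SelectKBest` with an ANOVA F-score is used here purely as a stand-in ranking method, and the data are synthetic placeholders.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC


def loo_cv(X, y, k_f):
    """Leave-one-out CV with feature selection refit inside every run.

    Returns the LOO accuracy and how often each feature was selected.
    """
    n_subjects, n_features = X.shape
    correct = 0
    selection_counts = np.zeros(n_features, dtype=int)
    for train_idx, test_idx in LeaveOneOut().split(X):
        # Rank features on the P - 1 training subjects only (no test leakage).
        # SelectKBest with an ANOVA F-score is a stand-in for IFS here.
        selector = SelectKBest(f_classif, k=k_f).fit(X[train_idx], y[train_idx])
        selection_counts[selector.get_support(indices=True)] += 1
        clf = SVC(kernel="linear").fit(selector.transform(X[train_idx]), y[train_idx])
        prediction = clf.predict(selector.transform(X[test_idx]))
        correct += int(prediction[0] == y[test_idx][0])
    return correct / n_subjects, selection_counts


# Synthetic placeholder data: 20 subjects, 50 features, 2 classes.
rng = np.random.default_rng(0)
y = np.array([0] * 10 + [1] * 10)
X = rng.normal(size=(20, 50))
X[y == 1, :5] += 3.0  # make the first five features informative

accuracy, counts = loo_cv(X, y, k_f=5)
print(accuracy, np.argsort(counts)[::-1][:5])  # accuracy and most often selected features
```

Keeping the selection step inside the loop means the held-out subject never influences which features are chosen, and the per-feature selection counts give the selection frequency across runs.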

The IFS method is a filter-based algorithm that aims to avoid over-fitting in high-dimensional data by discarding irrelevant and/or redundant features. Compared with other feature selection methods, IFS has the compelling property of efficiently identifying reliable, distinctive features for classification tasks. Most feature selection methods that rank and select features evaluate the importance of each feature individually, usually neglecting potential interactions among the elements of the joint set. In contrast, IFS performs ranking jointly with selection, so the score attributed to each feature is influenced by all other features. The idea is to build a graph over the feature distribution, where the vertices denote the features and the edges represent the pairwise relationships among them. The algorithm then ranks the morphological features by their importance and discriminative power, evaluating the importance of a given feature while considering all possible subsets of features. Given the output indices of the ranked features, we select the top \({K}_{f}\) ranked features to train a linear SVM classifier using the LOO-CV strategy to assign a label (LMCI or AD) to a new testing subject (Fig. 4C).
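
To make the graph-based idea concrete, the sketch below scores features by summing weighted paths of every length on a feature graph, in the spirit of infinite feature selection. It is a simplified, unsupervised illustration and does not reproduce the supervised IFS variant used in this study. The edge weights (a mix of feature spread and pairwise decorrelation, using Pearson correlation), the mixing weight `alpha`, the `damping` factor, and the function name are all assumptions of the sketch.

```python
import numpy as np


def path_sum_feature_scores(X, alpha=0.5, damping=0.9):
    """Score features by summing weighted paths of every length on a feature graph."""
    # Node-level term: spread of each feature, scaled to [0, 1].
    std = X.std(axis=0)
    std = std / std.max()
    spread = np.maximum.outer(std, std)
    # Pairwise term: reward feature pairs that are weakly correlated.
    decorrelation = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    A = alpha * spread + (1.0 - alpha) * decorrelation
    # Scale A so that the geometric series sum_{l >= 1} (r A)^l converges.
    r = damping / np.max(np.abs(np.linalg.eigvalsh(A)))
    n = A.shape[0]
    # Closed form of the series: (I - rA)^{-1} - I.
    S = np.linalg.inv(np.eye(n) - r * A) - np.eye(n)
    return S.sum(axis=1)


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))  # 30 subjects, 8 placeholder features
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=30)  # feature 1 nearly duplicates feature 0

scores = path_sum_feature_scores(X)
ranking = np.argsort(scores)[::-1]  # feature indices, highest score first
top_k = ranking[:3]                 # K_f = 3 in this toy example
print(top_k)
```

Because each score sums contributions from paths through all other features, changing one feature's relationship to the rest changes every score, which is the sense in which the ranking is joint rather than per-feature.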

Identification of morphological connectional biomarker

To identify morphological connectional biomarkers, we select the top \({n}_{f}\) indices of the \({K}_{f}\) discriminative features ranked by IFS56 across all P LOO-CV runs. Specifically, we generate a matrix of size \({n}_{f}\times P\), where each entry stores the index of the feature selected at a given rank in a given LOO-CV run. For a given feature \({f}_{k}\), we calculate its normalized rank across the LOO-CV runs as follows: \(r({f}_{k})=(\sum _{i=1}^{P}\sum _{j=1}^{{n}_{f}}{\delta }_{ij}{w}_{ij}({f}_{k}))/P\), where \({\delta }_{ij}=1\) if \({f}_{k}\) is selected at the ith LOO-CV run and jth rank, and \({\delta }_{ij}=0\) otherwise. The weight \({w}_{ij}({f}_{k})\) denotes the weight assigned to feature \({f}_{k}\) by IFS at the ith LOO-CV run and jth rank. Next, we identify the connectional biomarkers as the features with the top normalized ranks.
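
The normalized rank can be computed as in the following Python sketch. The array layout (one row per LOO-CV run, one column per rank), the function name, and the toy indices and weights are assumptions made for illustration.

```python
import numpy as np


def normalized_rank(top_indices, top_weights, n_features):
    """r(f_k) = (1 / P) * sum_i sum_j delta_ij * w_ij(f_k).

    top_indices[i, j]: index of the feature ranked j-th in LOO run i.
    top_weights[i, j]: weight given to that feature in run i.
    """
    P = top_indices.shape[0]
    r = np.zeros(n_features)
    # np.add.at accumulates correctly even when a feature index repeats.
    np.add.at(r, top_indices.ravel(), top_weights.ravel())
    return r / P


# Toy example: P = 3 LOO runs, n_f = 2 top features per run, 5 features overall.
top_indices = np.array([[4, 1], [4, 2], [1, 4]])
top_weights = np.array([[0.9, 0.5], [0.8, 0.4], [0.7, 0.6]])

r = normalized_rank(top_indices, top_weights, n_features=5)
biomarkers = np.argsort(r)[::-1][:2]  # the two features with the top normalized ranks
print(biomarkers)  # [4 1]
```

In this toy case feature 4 appears in all three runs, giving r = (0.9 + 0.8 + 0.6)/3 ≈ 0.77, while feature 1 appears in two runs, giving r = (0.5 + 0.7)/3 = 0.4.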

Availability of materials and data

The data that support the findings of this study are available from the ADNI database (http://adni.loni.usc.edu/). For reproducibility and comparability, the authors will make available upon request all morphological networks generated from the four cortical attributes (maximum principal curvature, cortical thickness, sulcal depth, and average curvature) for 77 subjects (41 AD and 36 LMCI), following approval by the ADNI Consortium. The Matlab code for generating an ensemble of multiplexes using M brain networks for a single subject (e.g., morphological, structural, or functional) is also available from the authors upon request.