Machine Learning and DWI Brain Communicability Networks for Alzheimer’s Disease Detection

Lella, Eufemia; Lombardi, Angela; Amoroso, Nicola; Diacono, Domenico; Maggipinto, Tommaso; Monaco, Alfonso; Bellotti, Roberto; Tangaro, Sabina

doi:10.3390/app10030934


¹ Istituto Nazionale di Fisica Nucleare, Sezione di Bari, via E. Orabona, 70125 Bari, Italy

² Dipartimento Interateneo di Fisica, Università degli Studi di Bari, via E. Orabona, 70125 Bari, Italy

³ Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari, via E. Orabona, 70125 Bari, Italy

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Appl. Sci. 2020, 10(3), 934; https://doi.org/10.3390/app10030934

Submission received: 11 December 2019 / Revised: 15 January 2020 / Accepted: 21 January 2020 / Published: 31 January 2020

(This article belongs to the Special Issue Computer-aided Biomedical Imaging 2020: Advances and Prospects)


Abstract

Signal processing and machine learning techniques are changing clinical practice based on medical imaging from many perspectives. Two major topics are (i) the development of computer aided diagnosis systems to provide clinicians with novel, non-invasive and low-cost support tools, and (ii) the development of new methodologies for the analysis of biomedical data aimed at finding new disease biomarkers. Advancements have been recently achieved in the context of Alzheimer’s disease (AD) diagnosis through the use of diffusion weighted imaging (DWI) data. When combined with tractography algorithms, this imaging modality enables the reconstruction of the physical connections of the brain that can be subsequently investigated through a complex network-based approach. A graph metric particularly suited to describe the disruption of the brain connectivity due to AD is communicability. In this work, we develop a machine learning framework for the classification and feature importance analysis of AD based on communicability at the whole brain level. We fairly compare the performance of three state-of-the-art classification models, namely support vector machines, random forests and artificial neural networks, on the connectivity networks of a balanced cohort of healthy control subjects and AD patients from the ADNI database. Moreover, we clinically validate the information content of the communicability metric by performing a feature importance analysis. Both performance comparison and feature importance analysis provide evidence of the robustness of the method. The results obtained confirm that the whole brain structural communicability alterations due to AD are a valuable biomarker for the characterization and investigation of pathological conditions.

Keywords: computer aided diagnosis; Alzheimer’s disease; machine learning; brain connectivity

1. Introduction

Alzheimer’s disease (AD) is the most widespread neurodegenerative disorder and is a growing health problem. It is mainly characterized by short-term memory loss in its earlier stages, followed by a progressive decline in other cognitive and behavioural functions as the disease advances [1]. Investigating useful biomarkers for the early diagnosis, prognosis and response to therapy is one of the primary goals of the current research activity in neuroscience.

A number of studies have provided evidence that the decline due to AD is related to disrupted connectivity among brain regions, caused by white matter (WM) degeneration, e.g., [2,3]. Because WM fibers have a homogeneous chemical composition, conventional MRI cannot resolve their structure and is therefore not suited to investigating the physical disconnections arising among them. Conversely, a promising technique for such an investigation is diffusion weighted imaging (DWI), which can probe the micro-structural integrity of the WM and can thus help identify WM alterations that may occur due to AD [4].

Different approaches have been proposed to study the diagnostic potential of DWI data, ranging from a finer voxelwise analysis [5] to an ROI-based approach [6]. In the last few years, a growing interest has arisen towards the application of an alternative approach based on complex network theory. When combined with tractography algorithms [7], in fact, DWI enables the reconstruction of the WM fiber tracts, providing a characterization of the physical connections of the brain that can be subsequently investigated through a complex network-based approach [8]. More precisely, the brain can be modeled as a network whose nodes are the anatomical regions and whose edges are related to the fiber tracts connecting them. Traditional network metrics suitable for describing topological properties of the brain include nodal degree and strength, and shortest path length.

In particular, a very promising research direction consists in feeding graph-based topological measures into machine learning algorithms so as to automate disease detection, e.g., [9,10,11]. Developing computer aided diagnosis systems is desirable, as they can provide a non-invasive, low-cost support tool that complements the traditional neuropsychological assessment performed by expert clinicians. Moreover, a great variety of state-of-the-art machine learning approaches have shown outstanding performance for early detection and automated classification of AD [12,13].

Recently, we investigated the usefulness in this context of an uncommon graph measure, namely communicability. Communicability quantifies the ease of communication between node pairs in a network by considering not only the shortest path connecting them, but all possible available routes [14]. For this reason, this metric proved to be particularly sensitive to the disruption of communication between brain regions due to AD [15,16]. In [15], communicability outperformed more classic graph measures on a mixed cohort of healthy control (HC) subjects and AD patients from the public Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/). Since the main goal was to compare communicability to other classic measures, we fixed the classification model, i.e., support vector machines. Furthermore, a cortical parcellation scheme was adopted for the estimated brain networks. On the one hand, the use of only one classifier may not be enough to claim the robustness of the communicability-based approach for supporting automatic disease detection. On the other hand, the use of a coarse anatomical scheme could have overlooked detailed patterns of connectivity, which may play a key role in the investigation of neurological diseases. In [16], we partially addressed these open issues by conducting a connectivity analysis restricted to the sub-cortical sub-network.

In the present work, we extend our previous analyses by comparing different state-of-the-art classification algorithms and, at the same time, by using a different parcellation scheme which takes into account the overall brain structural connectivity patterns. To this end, we developed a machine learning framework for both classification and feature analysis of AD based on communicability.

In Section 2, we describe the dataset used for the study. In Section 3, we outline the steps of the analysis: we describe the image processing pipeline used to obtain the connectivity network starting from DWI scans, we explain the feature extraction step, which consists in the calculation of the graph communicability for each node pair, and we provide a description of the different machine learning algorithms used for the classification comparison. In Section 4, we report the results of the classification comparison and of the feature analysis, identifying the region pairs most related to the disease detection according to all three classification methods. The results are discussed in the last section.

2. Materials

For the purposes of the present study, we used the data of a balanced cohort of 40 HC subjects and 40 age-matched AD patients from the ADNI database, so that the classifiers could be compared without the potential problems arising from an unbalanced dataset. ADNI is a multi-site, longitudinal research study that actively supports research on medical treatments to slow or stop the progression of AD. The overall goal of the study is to validate candidate biomarkers for use in clinical treatment trials.

The diffusion-weighted scans were randomly selected from baseline and follow-up study visits. HC subjects do not show signs of depression, mild cognitive impairment or dementia; participants with AD meet the NINCDS/ADRDA criteria for probable AD. Demographics and clinical scores for the participants are summarized in Table 1. Scans were acquired by using a 3-T GE Medical Systems scanner; more precisely, 46 separate images were acquired for each scan: five with negligible diffusion effects ($b_{0}$ images) and 41 diffusion-weighted images ($b = 1000$ s/mm²). For each subject, the T1 weighted anatomical scan was also used to perform tractography.

3. Methods

The proposed framework includes several steps which are described in the following subsections. It is worth noting that these steps carry a heavy computational burden: image processing alone takes about ten hours per subject. In order to carry out such an expensive computation, we used the distributed infrastructure of the ReCaS-Bari computing farm (https://www.recas-bari.it/index.php/it/).

3.1. Image Preprocessing

For each subject, the DICOM images were acquired from the ADNI database. The dcm2nii tool, included in the MRIcron suite, was used to convert the DICOM images into the NIFTI format. The NIFTI images were then organized in the standard BIDS format. The other processing steps, from image preprocessing to connectome reconstruction, were carried out with tools provided within the MRtrix3 software package (http://www.mrtrix.org/) and the FMRIB Software Library (FSL) (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki).

The main steps of the whole processing, which are well-established in the literature, are shown in Figure 1 and outlined in the following. First, a denoising step was performed to reduce thermal noise and thus enhance the signal-to-noise ratio of the diffusion weighted MR signals. This noise is due to the stochastic thermal motion of the water molecules and their interaction with the surrounding micro-structure [19]. Head motion and eddy current distortions were corrected by aligning the DWI images of each subject to the average $b_{0}$ image. The brain extraction tool (BET) was then used for skull-stripping [20]. Bias-field correction was then applied to all DWI volumes. Similarly, the T1 weighted scans were preprocessed by performing the following steps: reorientation to the MNI152 standard image, automatic cropping, bias-field correction, linear and non-linear registration to the standard space, and brain extraction. The next step was the inter-modal registration of the diffusion weighted and T1 weighted images.

After the preprocessing and co-registration steps, we performed the structural connectome generation. First, we generated a tissue-segmented image tailored to the anatomically constrained tractography [21]. Then, we performed an unsupervised estimation of WM, gray matter and cerebro-spinal fluid. In the next step, the fiber orientation distributions were estimated by spherical deconvolution [22]. We then performed a probabilistic tractography [23] using dynamic seeding [24] and anatomically-constrained tractography [25], which improves the tractography reconstruction by using anatomical information through a dynamic thresholding strategy. We applied the spherical-deconvolution informed filtering of tractograms (SIFT2) methodology [24], which provides not only more biologically meaningful estimates of the structural connection density, but also a more efficient solution to the streamlines connectivity quantification problem. The obtained streamlines were mapped through a T1 parcellation scheme by using the AAL2 atlas [26], which is a revised version of the automated anatomical labeling (AAL) atlas including 120 regions. Finally, a robust structural connectome construction was performed for generating the connectivity matrices [27]. The pipeline described here has been used in recent structural connectivity studies, for example [28,29].

The final output of the image processing step was a $120 \times 120$ weighted symmetric connectivity matrix for each subject: each entry corresponded to the fiber tracts connecting region i to region j. In contrast to our previous works, where only cortical or sub-cortical regions were considered disjointly, we here employed networks at the whole-brain level.

3.2. Feature Extraction

The connectivity matrix represents the structural complexity of the brain network. A powerful framework for mathematically treating such a complex system is graph theory. Several graph metrics can be computed from the connectivity matrix to describe the topological properties of the brain. Most of these measures are based on the shortest path connecting two nodes of the network. Their relevance rests on the idea that the communication between two nodes takes place through the shortest path connecting them. However, in many real-world networks, such as social networks, information does not necessarily flow along the shortest paths; moreover, it can go back and forth several times before reaching its final destination (e.g., [30,31]). As a consequence, relying only on shortest path-based models can lead to a substantial loss of information.

In order to overcome this drawback, Estrada and Hatano proposed a new concept of communicability, initially only for binary networks, defining the communicability between two nodes as a function of the total number of walks connecting them, with shorter walks given more importance than longer ones [14].

Let G be a graph of N nodes and A the corresponding $N \times N$ adjacency matrix. Then

$$(A^{k})_{pq} := \sum_{r_{1} = 1}^{N} \sum_{r_{2} = 1}^{N} \dots \sum_{r_{k-1} = 1}^{N} a_{p, r_{1}} a_{r_{1}, r_{2}} a_{r_{2}, r_{3}} \dots a_{r_{k-2}, r_{k-1}} a_{r_{k-1}, q} \tag{1}$$

counts the number of walks of length k starting at node p and ending at node q. The communicability between p and q is given by the total number of walks connecting them, with each walk of length k down-weighted by a factor $1/k!$ so that longer walks contribute less:

$$G_{pq} = \sum_{k = 0}^{\infty} \frac{(A^{k})_{pq}}{k!} = (e^{A})_{pq}. \tag{2}$$

This equation can also be rewritten in terms of the graph spectrum as:

$$G_{pq} = \sum_{j = 1}^{N} \varphi_{j}(p) \varphi_{j}(q) e^{\lambda_{j}}, \tag{3}$$

where $\varphi_{j}(p)$ is the p-th element of the j-th orthonormal eigenvector of the adjacency matrix, associated with the eigenvalue $\lambda_{j}$.
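As a quick sanity check of Equations (2) and (3), consider the smallest non-trivial case: two nodes joined by a single unweighted edge (an illustrative toy graph, not part of the study data). Its adjacency matrix has eigenvalues $\lambda_{1} = 1$ and $\lambda_{2} = -1$, with orthonormal eigenvectors $\varphi_{1} = (1, 1)/\sqrt{2}$ and $\varphi_{2} = (1, -1)/\sqrt{2}$, so that

$$G_{12} = \tfrac{1}{2} e^{1} - \tfrac{1}{2} e^{-1} = \sinh(1) \approx 1.175, \qquad G_{11} = \tfrac{1}{2} e^{1} + \tfrac{1}{2} e^{-1} = \cosh(1) \approx 1.543.$$

The same value follows from the walk-counting form: there is exactly one walk of each odd length between the two nodes and none of even length, and $\sum_{k\ \mathrm{odd}} 1/k! = \sinh(1)$.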

The concept of communicability was extended to the weighted case by Crofts and Higham [32]. The definition provided above is still valid, but A is now the $N \times N$ weighted matrix, and the factors in each product $a_{p, r_{1}} a_{r_{1}, r_{2}} a_{r_{2}, r_{3}} \dots a_{r_{k-2}, r_{k-1}} a_{r_{k-1}, q}$ are the weights of the successive steps $p \mapsto r_{1}$, $r_{1} \mapsto r_{2}$, etc. of a walk. In order to avoid the excessive influence of a node depending on its high weight, the authors proposed a normalization step which consists in dividing the weight $a_{ij}$ by $\sqrt{s_{i} s_{j}}$, where $s_{i}$ is the strength of node i (the sum of the weights of its edges). Therefore, the communicability between two nodes p and q in a weighted network is defined as:

$$G_{pq} = \left(\exp\left(D^{-1/2} A D^{-1/2}\right)\right)_{pq}, \tag{4}$$

where $D = \mathrm{diag}(s_{i})$ is the $N \times N$ diagonal strength matrix.
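The weighted definition can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names and the toy matrix are hypothetical, and the matrix exponential is approximated by a truncated Taylor series so that the example needs only the standard library. This is reasonable here because the normalized matrix has eigenvalues in [-1, 1], so the series converges quickly; in practice a library routine such as `scipy.linalg.expm` would normally be preferred.

```python
import math


def normalized_adjacency(a):
    """Return D^{-1/2} A D^{-1/2} for a symmetric, non-negative weight matrix."""
    n = len(a)
    strength = [sum(row) for row in a]  # s_i = sum of the weights of node i
    if any(s <= 0 for s in strength):
        raise ValueError("every node needs a positive strength")
    return [[a[i][j] / math.sqrt(strength[i] * strength[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain square-matrix product on nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_taylor(m, terms=30):
    """Approximate exp(M) by the truncated series sum_{k < terms} M^k / k!."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # k = 0 term: the identity
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, m)]  # M^k / k!
        for i in range(n):
            for j in range(n):
                result[i][j] += term[i][j]
    return result


def weighted_communicability(a):
    """Communicability matrix of Equation (4) for a weighted network."""
    return expm_taylor(normalized_adjacency(a))


# Toy 3-node network with made-up weights; nodes 0 and 2 share no direct edge.
toy = [[0.0, 2.0, 0.0],
       [2.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
G = weighted_communicability(toy)
```

In the toy network, `G[0][2]` is positive even though nodes 0 and 2 are not directly connected, because walks through node 1 contribute to their communicability.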

3.3. Model Fitting

The main goal of our analysis was to compare the performance of different classification algorithms, on the same data set, for discriminating AD from HC through communicability. To this end, we employed three state-of-the-art classification models:

Support vector machines (SVMs);
Random forests (RFs);
Artificial neural networks (ANNs).

They are briefly described in the following.

3.3.1. Support Vector Machines

SVMs construct a separating hyperplane between the two classes so that its minimal distance from the closest data points of either class is the largest [33]. Previously unseen examples are assigned to a class based on which side of the hyperplane they fall. In order to mitigate the effects of overfitting, a soft margin is used: the hyperplane is chosen so as to correctly separate most of the training examples, while tolerating the misclassification of some of them. To learn nonlinear decision boundaries, the data points can be mapped to a higher dimensional space via a kernel function: in the present work, we used a Gaussian radial basis function kernel. It is worth noting that the bias-variance trade-off of the algorithm is governed by the tuning of the penalty parameter C, which controls how many margin violations the algorithm tolerates when constructing the hyperplane, and the kernel coefficient $\gamma$. In this work, we set C to 1 and $\gamma$ to $\frac{1}{n}$, where n is the number of features: this is a typical parameter setting.
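To make the role of the kernel coefficient concrete, the following minimal sketch evaluates a Gaussian radial basis function kernel, $k(x, y) = \exp(-\gamma \lVert x - y \rVert^{2})$, with $\gamma = 1/n$. The feature vectors are made up and the function name is hypothetical; the source does not state which library was used to train the SVM.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Two made-up feature vectors with n = 4 features each.
x = [0.2, 1.0, 0.5, 0.0]
y = [0.1, 0.8, 0.9, 0.3]
gamma = 1.0 / len(x)  # gamma = 1/n, as in the setting described above
k_xy = rbf_kernel(x, y, gamma)
```

The kernel equals 1 for identical vectors and decays towards 0 as they move apart; a smaller $\gamma$ makes this decay slower, giving smoother decision boundaries. In scikit-learn, for instance, this setting would correspond to `SVC(kernel="rbf", C=1.0, gamma="auto")`.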

3.3.2. Random Forests

RF is a tree-based classification method which relies on the concept of bootstrap aggregating (or bagging): it builds a multitude of decision trees at training time and outputs the mode of the classes predicted by the individual trees at test time [34]. Bagging consists in repeatedly drawing a random sample with replacement from the training set and fitting a decision tree to this sample. In contrast to ordinary bagging, when building a decision tree RF does not consider the entire set of available features but chooses random subsets of features. This serves to avoid growing highly correlated trees. In the present work, 500 trees were used to build the forest: this is a common choice.
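The three ingredients described above (bootstrap sampling, random feature subsets and majority voting) can be sketched as follows. This is only an illustration with hypothetical helper names, not a full random forest: tree fitting is omitted, and the subset size of about $\sqrt{n}$ features is a common default assumed here, not a value reported in the source.

```python
import math
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bagging sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_feature_subset(n_features, rng):
    """Pick about sqrt(n_features) distinct features (assumed default size)."""
    k = max(1, int(math.sqrt(n_features)))
    return rng.sample(range(n_features), k)


def majority_vote(tree_predictions):
    """Return the mode of the class labels predicted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)
rows = bootstrap_indices(80, rng)        # e.g. 80 subjects
cols = random_feature_subset(100, rng)   # e.g. 100 candidate features
label = majority_vote(["AD", "HC", "AD", "AD", "HC"])
```

Because each tree sees a different bootstrap sample and different feature subsets, the individual trees make partly independent errors, which the vote averages out.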

3.3.3. Artificial Neural Networks

In this work, we used the classic Multi-layer Perceptron (MLP) architecture. Briefly speaking, an MLP is a feed-forward artificial neural network that can learn a non-linear function approximator for either classification or regression [35]. In contrast to traditional logistic regression, which is based on a single weighted linear combination between the input layer and the output layer, an MLP provides one or more non-linear (hidden) layers. In the present work, we used an MLP with two hidden layers (32 hidden units each) and the commonly used ReLU as activation function. Employing more hidden layers would have had a negative impact on classification performance, given the higher number of parameters to be optimized with respect to the number of examples. The network optimizes the log-loss function via backpropagation using the Limited-memory BFGS algorithm [36]. This is an optimization algorithm in the family of quasi-Newton methods which is known to perform well when, as in this case, the training set is small [37].
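The architecture described above can be sketched as a forward pass in plain Python. This is an illustration under stated assumptions, not the network used in the study: the weights are random rather than trained (no L-BFGS step is shown), the sigmoid output unit is assumed for the binary task, and the input size of 10 features and all function names are hypothetical.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Affine layer: one weighted sum of the inputs per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Random weights in a small range and zero biases (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """ReLU hidden layers followed by a single sigmoid output unit."""
    hidden = x
    for weights, bias in layers[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    weights, bias = layers[-1]
    z = dense(hidden, weights, bias)[0]
    return 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1


def log_loss(y_true, p, eps=1e-12):
    """Binary cross-entropy for one example, clipped for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


rng = random.Random(0)
n_features = 10  # toy input size
layers = [init_layer(n_features, 32, rng),  # first hidden layer, 32 units
          init_layer(32, 32, rng),          # second hidden layer, 32 units
          init_layer(32, 1, rng)]           # output unit
x = [rng.random() for _ in range(n_features)]
p = forward(x, layers)
loss = log_loss(1, p)
```

Training would adjust the weights to minimize the average of `log_loss` over the training examples.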

3.4. Feature Analysis

The supervised algorithms provided within our framework are naturally equipped with methods to assess the importance of the input features by computing weighted rankings:

Support vector machines: we used the popular recursive feature elimination (SVM-RFE) algorithm [38]. The method uses criteria derived from the SVM model to assess feature importance and removes the features with the smallest criteria. The process is repeated iteratively until all features have been removed from the feature set: the final output is a ranked feature list.
Random forests: for each tree, the feature importance was calculated as the decrease in node impurity weighted by the expected fraction of the samples reaching that node. For the overall forest, the normalized feature importances of the individual trees were simply summed.
Artificial neural networks: we used the well-known Gedeon method [39], which computes a feature ranking by considering the weights connecting the input features to both hidden layers.

4. Experimental Results

In this section, the results of two analyses are reported. The first one was aimed at comparing the performance of the three classification models employed. The second one was a feature importance analysis aimed at evaluating the effectiveness of communicability in identifying the brain regions whose connectivity is more related to AD.

4.1. Classification Performance

In order to validate the classification performance, we used a 10-fold cross-validation. With this scheme the set of examples is divided into 10 folds of equal size: nine folds are used to train the learning algorithm, and the remaining fold is used to test it. This train/test step is repeated 10 times, until every fold has been used as the test set once. Note that the splitting was stratified by diagnosis so as to have approximately the same number of examples from each diagnostic group in each fold. The whole 10-fold cross-validation was in turn repeated ten times, with different permutations of the training and test examples, in order to obtain a more reliable estimate of the generalization performance.
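The validation scheme described above can be sketched as follows for a cohort of 40 HC and 40 AD subjects. The function names and the seed are hypothetical, and the sketch only generates the train/test index splits; fitting and scoring the classifiers is left out.

```python
import random


def stratified_folds(labels, n_folds, rng):
    """Split sample indices into folds with balanced class proportions."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)  # a new permutation for every repetition
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)  # deal indices round-robin
    return folds


def repeated_stratified_cv(labels, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        for test_idx in stratified_folds(labels, n_folds, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            yield train_idx, test_idx


labels = ["HC"] * 40 + ["AD"] * 40
splits = list(repeated_stratified_cv(labels))
```

With these settings each of the 100 splits holds out 8 subjects (4 HC and 4 AD) and trains on the remaining 72.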

The results obtained are expressed in terms of traditional performance metrics: accuracy, area under the ROC curve (AUC), sensitivity and specificity. In the following, the mean values are reported, averaged over all the cross-validation iterations, together with their standard errors.
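For reference, the threshold-based metrics and the standard error of the mean can be computed as in the sketch below. The labels and values are made up, AD is assumed to be the positive class, and AUC is omitted because it requires continuous classifier scores rather than hard labels.

```python
import math


def classification_metrics(y_true, y_pred, positive="AD"):
    """Accuracy, sensitivity and specificity from true and predicted labels."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # fraction of patients detected
        "specificity": tn / (tn + fp),  # fraction of controls recognized
    }


def mean_and_standard_error(values):
    """Sample mean and standard error of the mean (needs >= 2 values)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Made-up predictions for one test fold of 4 AD and 4 HC subjects.
y_true = ["AD", "AD", "AD", "AD", "HC", "HC", "HC", "HC"]
y_pred = ["AD", "AD", "AD", "HC", "HC", "HC", "AD", "AD"]
fold_metrics = classification_metrics(y_true, y_pred)

# Made-up accuracies from three cross-validation iterations.
mean_acc, se_acc = mean_and_standard_error([0.7, 0.8, 0.9])
```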

Figure 2 shows the classification performance of the three classifiers. Although the performance of all three is quite comparable, ANN slightly outperformed SVM and RF both in terms of accuracy (i.e., $0.75 \pm 0.01$) and AUC (i.e., $0.83 \pm 0.01$). Interestingly, SVM and RF provided a specificity higher than sensitivity; with ANN, instead, this trend is reversed, as a sensitivity higher than specificity was obtained (i.e., $0.80 \pm 0.02$ and $0.70 \pm 0.01$, respectively). The mean sensitivity of ANN was found to be significantly different from that obtained by the other two classifiers (p-value $< 0.0001$ and p-value $= 0.0003$ in the comparison with SVM and RF, respectively, with the Mann-Whitney U test). Concerning the other performance metrics, no statistically significant difference, at the significance level of $0.01$, was found between the three models.
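The statistic underlying the Mann-Whitney U test is simple to state: it counts, over all pairs formed by one value from each sample, how often the first value exceeds the second, with ties counted as one half. The sketch below computes only this statistic on made-up sensitivity values; the p-value is not computed here and would in practice come from a statistics library such as `scipy.stats.mannwhitneyu`.

```python
def mann_whitney_u(sample_x, sample_y):
    """U statistic of sample_x versus sample_y (ties count as 0.5)."""
    u = 0.0
    for x in sample_x:
        for y in sample_y:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Made-up per-iteration sensitivities for two classifiers.
sens_a = [0.80, 0.82, 0.78, 0.85]
sens_b = [0.70, 0.72, 0.78, 0.69]
u_ab = mann_whitney_u(sens_a, sens_b)
u_ba = mann_whitney_u(sens_b, sens_a)
```

The two statistics always sum to the number of pairs, so a value of `u_ab` close to that maximum indicates that the first sample tends to be larger.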

Another important question concerns the agreement between the three classification models on the labels assigned to the test examples. In principle, two distinct models could perform similarly while misclassifying different examples. In order to evaluate the inter-annotator agreement (treating each classifier as an annotator), we calculated the well-known Cohen’s $\kappa$ between the pair-wise predictions. This value ranges from $-1$ to $+1$: high values indicate good agreement; values of zero or lower indicate agreement no better than chance. We observed a $\kappa$ of ∼$0.67$ between ANN and the other two models and a $\kappa$ of ∼$0.85$ between RF and SVM.
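Cohen's $\kappa$ compares the observed agreement $p_{o}$ between two sets of predictions with the agreement $p_{e}$ expected by chance from their label frequencies, $\kappa = (p_{o} - p_{e}) / (1 - p_{e})$. A minimal sketch on made-up predictions (the function name is hypothetical):

```python
from collections import Counter


def cohen_kappa(pred_a, pred_b):
    """Cohen's kappa between two equally long sequences of predicted labels."""
    n = len(pred_a)
    observed = sum(a == b for a, b in zip(pred_a, pred_b)) / n
    count_a, count_b = Counter(pred_a), Counter(pred_b)
    labels = set(count_a) | set(count_b)
    # Chance agreement: product of the two marginal label frequencies.
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both always predict one label")
    return (observed - expected) / (1.0 - expected)


# Made-up predictions of two classifiers on the same four test subjects.
model_1 = ["AD", "AD", "HC", "HC"]
model_2 = ["AD", "HC", "HC", "HC"]
kappa = cohen_kappa(model_1, model_2)
```

Here the two models agree on 3 of 4 subjects ($p_{o} = 0.75$) while chance agreement is $p_{e} = 0.5$, giving $\kappa = 0.5$.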

4.2. Feature Importance

In the second part of our analysis, we evaluated which features had a special role in the disease prediction. To this end, we used the feature ranking methods described in Section 3.4. It is worth noting that, for a more robust evaluation, we computed the importance rankings over one hundred different random sub-samples of the subjects, each having the same class distribution as the original sample.

We found 25 region pairs that lie above the 90th percentile of the importance distribution for all three methods. The region pairs with the highest relative importance, in accordance with the AAL2 atlas, are depicted in Figure 3. Table 2 shows the seven regions which occurred more than once in the 25 most important region pairs. Finally, Table 3 shows the anatomical areas whose regions occur most often. It is worth noting the importance of the Occipital Lobe and its regions, as well as the importance of the subcortical sub-network, particularly Caudate.
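The selection of region pairs shared by the three rankings can be sketched as a percentile threshold followed by a set intersection. The sketch below is an assumption about how such a selection could be implemented, not the study's code: the nearest-rank percentile definition, the strict inequality, the function names and the toy scores are all illustrative choices.

```python
import math


def above_percentile(importance, percentile=90.0):
    """Features whose score lies strictly above the given percentile."""
    scores = sorted(importance.values())
    # Nearest-rank percentile (assumed definition).
    rank = max(1, math.ceil(percentile * len(scores) / 100.0))
    threshold = scores[rank - 1]
    return {name for name, score in importance.items() if score > threshold}


def common_top_features(rankings, percentile=90.0):
    """Features selected by every ranking method."""
    return set.intersection(*(above_percentile(r, percentile) for r in rankings))


# Toy importance scores for 20 hypothetical region pairs and three methods.
names = [f"pair_{i}" for i in range(1, 21)]
svm_scores = {name: float(i) for i, name in enumerate(names, start=1)}
rf_scores = {name: float(21 - i) for i, name in enumerate(names, start=1)}
rf_scores["pair_20"] = 100.0
ann_scores = dict(svm_scores)

shared = common_top_features([svm_scores, rf_scores, ann_scores])
```

In this toy case only `pair_20` ranks in the top 10% for all three methods, so it is the only pair retained.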

5. Discussion and Conclusions

In this paper, we developed a framework that simultaneously exploits complex network features and machine learning algorithms to investigate the information content of the communicability metric in the discrimination between patients and controls and to explore the most relevant AD-related brain regions at the whole brain level. In doing this, we extended our previous research [15,16] by addressing two major open issues. As for the first issue, since in [15] the goal was to compare communicability to other traditional network metrics, we used only one classification algorithm. In this work, we employed three different state-of-the-art classification algorithms in order to evaluate whether the information content of communicability is robust against the use of a particular learning algorithm. We found that all the models employed, i.e., SVMs, RFs and ANNs, provided comparable values of accuracy, AUC and specificity, which are in line with our previous work. As far as sensitivity is concerned, instead, a significantly higher value was obtained with ANN, and thus we can conclude that this classification model is more sensitive in detecting the disease starting from the features used in this analysis.

There is usually a trade-off between sensitivity and specificity. A sensitivity higher than specificity is preferable in diagnosis support systems, as this means that the system is better at detecting the presence of disease in the pathological group than at detecting the absence of disease in the healthy group. Therefore, the system is more effective for ruling out disease when it returns a negative response. In addition, it is worth remarking that the high value of sensitivity has been obtained here with a perfectly balanced dataset. Having balanced data provides a more reliable evaluation of the performance, as it is well known that, in the case of unbalanced data, classification algorithms tend to favor the correct prediction of the over-represented class over the other ones. However, it is worth noting that in a clinical context achieving a high sensitivity is of paramount importance, so it could be appropriate to also explore situations with unbalanced datasets or to adopt strategies that alleviate the risk of overlooking the diagnosis [40].

The second important issue concerns the dependence of the proposed method on the network size. In our previous work [15], coarse anatomical connectivities were considered for the estimated brain networks. Connectivities obtained between relatively large region patterns are more robust and reproducible; however, they could overlook detailed patterns of connectivity, which may play a key role in the investigation of neurological diseases. This issue was partially addressed in [16], where we focused only on the subcortical sub-network. In this paper, we further extended our previous studies by taking into account a different parcellation scheme and a different reconstruction of the brain connectivity network to study patterns of connectivity emerging at the whole-brain level.

First of all, we observed that the most significant brain region pairs involve connections between cortical regions or between subcortical regions and cortical regions. No significant connections between subcortical regions were found, confirming our previous results about the hypothesis that AD connectivity alterations mostly regard the inter-connectivity between subcortical and cortical regions rather than the intra-subcortical connectivity. Among the most significantly different connections we found the Temporal-Frontal and Temporal-Parietal ones, which are known to be affected in AD [41,42]. The anatomical areas occurring most often in the most significant brain region pairs are the Occipital, Parietal and Temporal Lobes, which are highly AD-related brain regions [43,44,45]. Indeed, the Occipital area is responsible for visual processing, the Parietal area has an important role in integrating senses, while the Temporal area is essential for memory. The Cerebellum has also been found among the most frequently occurring regions. The role of the Cerebellum in AD has been a topic of debate in recent years: only recently has its close association with cognitive deterioration emerged, e.g., [46,47]. Three subcortical regions were found among the most significantly different node pairs: right Caudate, left and right Hippocampus and Putamen. These regions also play an important role in AD [48,49,50]. It is interesting to note that there are some significant region pairs with greater communicability in AD compared to HC. Similar findings have been reported, for example, in [51], a study of a disconnection syndrome that also started from DWI data, where some areas of greater communicability in stroke patients compared to controls were found also in the lesioned hemisphere. One possible interpretation of this pattern is that the increased communicability in AD could reflect adaptive changes in the white matter structure that have occurred secondary to the disease.

We also compared the most frequently occurring regions resulting from this study to our previous findings [15,16]. In particular, we found a significant overlap with some cortical and sub-cortical anatomical regions: parietal lobe, paracentral gyrus and temporal areas are the most overlapping regions. In addition, we found a consistent overlap in both parahippocampal and caudal regions.

Another important issue concerns the agreement between the three classification algorithms. Although the classification models are based on different algorithms, they showed very similar findings in accuracy, AUC and specificity and in the detection of the most significant features. These results attest to the robustness of the framework based on whole brain graph communicability.

Other open issues should be addressed in future work. In this paper, only the binary discrimination HC/AD was taken into account: the classification task involving mild cognitive impairment (MCI) subjects should also be considered in order to support the disease diagnosis at earlier stages. However, it should be noted that the AD patients from the ADNI database that we have taken into account for the present study are characterized by an MMSE score indicating a moderate and not severe cognitive impairment.

In addition, a greater sample size should be used and other classification strategies should be explored to further improve the diagnostic accuracy. Novel insights could be obtained, for example, by using a multi-expert approach or a set of fuzzy inference rules. Promising results from using these techniques for diagnostic purposes in other domains have been recently reported, e.g., [52,53]. A multi-expert approach, which is based on ensembling different classifiers trained for the same task, may further improve prediction accuracy, as it can benefit from the different viewpoints from which these classifiers may look at the data. Fuzzy inference rules, in turn, can be extracted from data with the aim of providing more explicit classification rules and thus better supporting the decisions made by physicians. The use of Fuzzy Logic has proved beneficial in brain MRI. As a matter of fact, Fuzzy C-Means clustering was effectively applied also to MR image segmentation in neuroimaging for neurodegenerative diseases [54,55], as well as in oncology [56]. Future work will address these issues.

Author Contributions

Conceptualization, E.L., A.L. and S.T.; data curation, E.L.; investigation, E.L. and A.L.; software, E.L.; writing—original draft preparation, E.L.; methodology, E.L., A.L., S.T. and R.B.; formal analysis E.L. and A.L.; visualization, E.L. and A.L.; resources, D.D.; supervision, R.B. and S.T.; validation, all authors; writing—review and editing, all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This paper has been partially supported by the Apulian regional INNONETWORK project, project code BNLGWP7.

Acknowledgments

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from organizations including Bristol-Myers Squibb Company; CereSpir, Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Alzheimer’s Association. 2018 Alzheimer’s disease facts and figures. Alzheimer’s Dement. 2018, 14, 367–429.
2. Rose, S.E.; Chen, F.; Chalk, J.B.; Zelaya, F.O.; Strugnell, W.E.; Benson, M.; Semple, J.; Doddrell, D.M. Loss of connectivity in Alzheimer’s disease: An evaluation of white matter tract integrity with colour coded MR diffusion tensor imaging. J. Neurol. Neurosurg. Psychiatry 2000, 69, 528–530.
3. Lo, C.Y.; Wang, P.N.; Chou, K.H.; Wang, J.; He, Y.; Lin, C.P. Diffusion tensor tractography reveals abnormal topological organization in structural cortical networks in Alzheimer’s disease. J. Neurosci. 2010, 30, 16876–16885.
4. Hoy, A.R.; Ly, M.; Carlsson, C.M.; Okonkwo, O.C.; Zetterberg, H.; Blennow, K.; Sager, M.A.; Asthana, S.; Johnson, S.C.; Alexander, A.L.; et al. Microstructural white matter alterations in preclinical Alzheimer’s disease detected using free water elimination diffusion tensor imaging. PLoS ONE 2017, 12, e0173982.
5. Maggipinto, T.; Bellotti, R.; Amoroso, N.; Diacono, D.; Donvito, G.; Lella, E.; Monaco, A.; Scelsi, M.A.; Tangaro, S. DTI measurements for Alzheimer’s classification. Phys. Med. Biol. 2017, 62, 2361.
6. Dyrba, M.; Grothe, M.; Kirste, T.; Teipel, S.J. Multimodal analysis of functional and structural disconnection in Alzheimer’s disease using multiple kernel SVM. Hum. Brain Mapp. 2015, 36, 2118–2131.
7. Basser, P.J.; Pajevic, S.; Pierpaoli, C.; Duda, J.; Aldroubi, A. In vivo fiber tractography using DT-MRI data. Magn. Reson. Med. 2000, 44, 625–632.
8. Bullmore, E.; Sporns, O. Complex brain networks: Graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 2009, 10, 186.
9. Schouten, T.M.; Koini, M.; de Vos, F.; Seiler, S.; de Rooij, M.; Lechner, A.; Schmidt, R.; van den Heuvel, M.; van der Grond, J.; Rombouts, S.A. Individual classification of Alzheimer’s disease with diffusion magnetic resonance imaging. NeuroImage 2017, 152, 476–481.
10. Lombardi, A.; Tangaro, S.; Bellotti, R.; Bertolino, A.; Blasi, G.; Pergola, G.; Taurisano, P.; Guaragnella, C. A novel synchronization-based approach for functional connectivity analysis. Complexity 2017, 2017, 7190758.
11. Lombardi, A.; Guaragnella, C.; Amoroso, N.; Monaco, A.; Fazio, L.; Taurisano, P.; Pergola, G.; Blasi, G.; Bertolino, A.; Bellotti, R.; et al. Modelling cognitive loads in schizophrenia by means of new functional dynamic indexes. NeuroImage 2019, 195, 150–164.
12. Jo, T.; Nho, K.; Saykin, A.J. Deep Learning in Alzheimer’s Disease: Diagnostic Classification and Prognostic Prediction Using Neuroimaging Data. Front. Aging Neurosci. 2019, 11, 220.
13. Amoroso, N.; Diacono, D.; Fanizzi, A.; La Rocca, M.; Monaco, A.; Lombardi, A.; Guaragnella, C.; Bellotti, R.; Tangaro, S.; Alzheimer’s Disease Neuroimaging Initiative; et al. Deep learning reveals Alzheimer’s disease onset in MCI subjects: Results from an international challenge. J. Neurosci. Methods 2018, 302, 3–9.
14. Estrada, E.; Hatano, N. Communicability in complex networks. Phys. Rev. E 2008, 77, 036111.
15. Lella, E.; Amoroso, N.; Lombardi, A.; Maggipinto, T.; Tangaro, S.; Bellotti, R. Communicability disruption in Alzheimer’s disease connectivity networks. J. Complex Netw. 2018, 7, 83–100.
16. Lella, E.; Amoroso, N.; Diacono, D.; Lombardi, A.; Maggipinto, T.; Monaco, A.; Bellotti, R.; Tangaro, S. Communicability Characterization of Structural DWI Subcortical Networks in Alzheimer’s Disease. Entropy 2019, 21, 475.
17. Rosen, W.G.; Mohs, R.C.; Davis, K.L. A new rating scale for Alzheimer’s disease. Am. J. Psychiatry 1984, 141, 1356–1364.
18. Mohs, R.C.; Knopman, D.; Petersen, R.C.; Ferris, S.H.; Ernesto, C.; Grundman, M.; Sano, M.; Bieliauskas, L.; Geldmacher, D.; Clark, C.; et al. Development of cognitive instruments for use in clinical trials of antidementia drugs: Additions to the Alzheimer’s Disease Assessment Scale that broaden its scope. Alzheimer Dis. Assoc. Disord. 1997, 11 (Suppl. 2), S13–S21.
19. Veraart, J.; Novikov, D.S.; Christiaens, D.; Ades-Aron, B.; Sijbers, J.; Fieremans, E. Denoising of diffusion MRI using random matrix theory. NeuroImage 2016, 142, 394–406.
20. Smith, S.M. Fast robust automated brain extraction. Hum. Brain Mapp. 2002, 17, 143–155.
21. Zhang, Y.; Brady, M.; Smith, S. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Med. Imaging 2001, 20, 45–57.
22. Jeurissen, B.; Tournier, J.D.; Dhollander, T.; Connelly, A.; Sijbers, J. Multi-tissue constrained spherical deconvolution for improved analysis of multi-shell diffusion MRI data. NeuroImage 2014, 103, 411–426.
23. Tournier, J.D.; Calamante, F.; Connelly, A. Improved probabilistic streamlines tractography by 2nd order integration over fibre orientation distributions. Proc. Int. Soc. Magn. Reson. Med. 2010, 18, 1670.
24. Smith, R.E.; Tournier, J.D.; Calamante, F.; Connelly, A. SIFT2: Enabling dense quantitative assessment of brain white matter connectivity using streamlines tractography. NeuroImage 2015, 119, 338–351.
25. Smith, R.E.; Tournier, J.D.; Calamante, F.; Connelly, A. Anatomically-constrained tractography: Improved diffusion MRI streamlines tractography through effective use of anatomical information. NeuroImage 2012, 62, 1924–1938.
26. Rolls, E.T.; Joliot, M.; Tzourio-Mazoyer, N. Implementation of a new parcellation of the orbitofrontal cortex in the automated anatomical labeling atlas. NeuroImage 2015, 122, 1–5.
27. Smith, R.E.; Tournier, J.D.; Calamante, F.; Connelly, A. The effects of SIFT on the reproducibility and biological accuracy of the structural connectome. NeuroImage 2015, 104, 253–265.
28. Amico, E.; Goñi, J. Mapping hybrid functional-structural connectivity traits in the human connectome. Netw. Neurosci. 2018, 2, 306–322.
29. Tipnis, U.; Amico, E.; Ventresca, M.; Goñi, J. Modeling communication processes in the human connectome through cooperative learning. IEEE Trans. Netw. Sci. Eng. 2018.
30. Borgatti, S.P. Centrality and network flow. Social Netw. 2005, 27, 55–71.
31. Hromkovič, J.; Klasing, R.; Pelc, A.; Ruzicka, P.; Unger, W. Dissemination of Information in Communication Networks: Broadcasting, Gossiping, Leader Election, and Fault-Tolerance; Springer Science & Business Media: Dordrecht, The Netherlands, 2005.
32. Crofts, J.J.; Higham, D.J. A weighted communicability measure applied to complex brain networks. J. R. Soc. Interface 2009, 6, 411–414.
33. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: New York, NY, USA, 2013; Volume 112.
34. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
35. Murphy, K.P. Machine Learning: A Probabilistic Perspective; Adaptive Computation and Machine Learning Series; MIT Press: Cambridge, MA, USA, 2018.
36. Liu, D.C.; Nocedal, J. On the limited memory BFGS method for large scale optimization. Math. Programm. 1989, 45, 503–528.
37. Morales, J.L.; Nocedal, J. Remark on “algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization”. ACM Trans. Math. Softw. 2011, 38, 1–4.
38. Guyon, I.; Weston, J.; Barnhill, S.; Vapnik, V. Gene selection for cancer classification using support vector machines. Mach. Learn. 2002, 46, 389–422.
39. Gedeon, T.D. Data mining of inputs: Analysing magnitude and functional measures. Int. J. Neural Syst. 1997, 8, 209–218.
40. Han, C.; Murao, K.; Noguchi, T.; Kawata, Y.; Uchiyama, F.; Rundo, L.; Nakayama, H.; Satoh, S. Learning more with less: Conditional PGGAN-based data augmentation for brain metastases detection using highly-rough annotation on MR images. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM), Beijing, China, 3–7 November 2019; pp. 119–127.
41. Babiloni, C.; Lizio, R.; Marzano, N.; Capotosto, P.; Soricelli, A.; Triggiani, A.I.; Cordone, S.; Gesualdo, L.; Del Percio, C. Brain neural synchronization and functional coupling in Alzheimer’s disease as revealed by resting state EEG rhythms. Int. J. Psychophysiol. 2016, 103, 88–102.
42. Koelewijn, L.; Bompas, A.; Tales, A.; Brookes, M.J.; Muthukumaraswamy, S.D.; Bayer, A.; Singh, K.D. Alzheimer’s disease disrupts alpha and beta-band resting-state oscillatory network connectivity. Clin. Neurophysiol. 2017, 128, 2347–2357.
43. Naggara, O.; Oppenheim, C.; Rieu, D.; Raoux, N.; Rodrigo, S.; Dalla Barba, G.; Meder, J.F. Diffusion tensor imaging in early Alzheimer’s disease. Psychiatry Res. Neuroimaging 2006, 146, 243–249.
44. Arnold, S.E.; Hyman, B.T.; Flory, J.; Damasio, A.R.; Van Hoesen, G.W. The topographical and neuroanatomical distribution of neurofibrillary tangles and neuritic plaques in the cerebral cortex of patients with Alzheimer’s disease. Cereb. Cortex 1991, 1, 103–116.
45. Braak, H.; Braak, E.; Kalus, P. Alzheimer’s disease: Areal and laminar pathology in the occipital isocortex. Acta Neuropathol. 1989, 77, 494–506.
46. Jacobs, H.I.; Hopkins, D.A.; Mayrhofer, H.C.; Bruner, E.; van Leeuwen, F.W.; Raaijmakers, W.; Schmahmann, J.D. The cerebellum in Alzheimer’s disease: Evaluating its role in cognitive decline. Brain 2017, 141, 37–47.
47. Zheng, W.; Liu, X.; Song, H.; Li, K.; Wang, Z. Altered functional connectivity of cognitive-related cerebellar subregions in Alzheimer’s disease. Front. Aging Neurosci. 2017, 9, 143.
48. Ryan, N.S.; Keihaninejad, S.; Shakespeare, T.J.; Lehmann, M.; Crutch, S.J.; Malone, I.B.; Thornton, J.S.; Mancini, L.; Hyare, H.; Yousry, T.; et al. Magnetic resonance imaging evidence for presymptomatic change in thalamus and caudate in familial Alzheimer’s disease. Brain 2013, 136, 1399–1414.
49. Allen, G.; Barnard, H.; McColl, R.; Hester, A.L.; Fields, J.A.; Weiner, M.F.; Ringe, W.K.; Lipton, A.M.; Brooker, M.; McDonald, E.; et al. Reduced hippocampal functional connectivity in Alzheimer disease. Arch. Neurol. 2007, 64, 1482–1487.
50. De Jong, L.W.; van der Hiele, K.; Veer, I.M.; Houwing, J.; Westendorp, R.; Bollen, E.; de Bruin, P.W.; Middelkoop, H.; van Buchem, M.A.; van der Grond, J. Strongly reduced volumes of putamen and thalamus in Alzheimer’s disease: An MRI study. Brain 2008, 131, 3277–3285.
51. Crofts, J.J.; Higham, D.J.; Bosnell, R.; Jbabdi, S.; Matthews, P.M.; Behrens, T.; Johansen-Berg, H. Network analysis detects changes in the contralesional hemisphere following stroke. NeuroImage 2011, 54, 161–169.
52. Angelillo, M.T.; Balducci, F.; Impedovo, D.; Pirlo, G.; Vessio, G. Attentional Pattern Classification for Automatic Dementia Detection. IEEE Access 2019, 7, 57706–57716.
53. Casalino, G.; Castellano, G.; Pasquadibisceglie, V.; Zaza, G. Contact-Less Real-Time Monitoring of Cardiovascular Risk Using Video Imaging and Fuzzy Inference Rules. Information 2019, 10, 9.
54. Kumar, D.; Verma, H.; Mehra, A.; Agrawal, R. A modified intuitionistic fuzzy c-means clustering approach to segment human brain MRI image. Multimed. Tools Appl. 2019, 78, 12663–12687.
55. Tangaro, S.; Fanizzi, A.; Amoroso, N.; Bellotti, R.; Alzheimer’s Disease Neuroimaging Initiative. A fuzzy-based system reveals Alzheimer’s disease onset in subjects with Mild Cognitive Impairment. Phys. Med. 2017, 38, 36–44.
56. Militello, C.; Rundo, L.; Vitabile, S.; Russo, G.; Pisciotta, P.; Marletta, F.; Ippolito, M.; D’arrigo, C.; Midiri, M.; Gilardi, M.C. Gamma Knife treatment planning: MR brain tumor segmentation and volume measurement based on unsupervised Fuzzy C-Means clustering. Int. J. Imaging Syst. Technol. 2015, 25, 213–225.

Figure 1. Image processing pipeline adapted from [16].

Figure 2. Classification performance.

Figure 3. Glass brain visualization of the mean communicability difference between HC and AD for the region pairs with the highest importance. Edge color and edge thickness encode the difference values.
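To make the quantity shown in Figure 3 concrete, the sketch below computes a weighted communicability matrix for a toy network. It assumes the weighted definition of Crofts and Higham [32], exp(D^(-1/2) W D^(-1/2)), where W is the weighted connectivity matrix and D the diagonal matrix of node strengths; the exact formulation used in this study is given in the Methods and may differ. The helper names and the three-region weight matrix are hypothetical, and the matrix exponential is approximated by a truncated Taylor series to keep the example dependency-free.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=30):
    """Approximate exp(m) with the first `terms` terms of its Taylor series."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds m^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def weighted_communicability(w):
    """Return exp(D^(-1/2) W D^(-1/2)) for a symmetric weight matrix w."""
    n = len(w)
    strength = [sum(row) for row in w]
    norm = [[w[i][j] / math.sqrt(strength[i] * strength[j]) if w[i][j] else 0.0
             for j in range(n)] for i in range(n)]
    return mat_exp(norm)

# Hypothetical 3-region network: regions 0 and 2 are linked only via region 1.
weights = [[0.0, 10.0, 0.0],
           [10.0, 0.0, 5.0],
           [0.0, 5.0, 0.0]]
G = weighted_communicability(weights)
print(G[0][2])  # positive although there is no direct 0-2 connection
```

Because communicability sums over walks of all lengths, two regions without a direct connection can still have a non-zero value, which is why the metric is sensitive to indirect pathways.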

Table 1. Demographic and clinical characteristics of the study participants. For the clinical assessment, the mini-mental state examination test (MMSE), Alzheimer’s disease assessment scale (ADAS) 11 [17] and ADAS 13 [18] scores are reported. According to the t-test statistics, MMSE, ADAS 11 and ADAS 13 are significantly different between healthy controls (HC) and Alzheimer’s disease (AD). For age and gender, the chi-squared test was performed.

	HC (40)	AD (40)	p-Value
Age	$73.06 \pm 5.54$	$75.8 \pm 9.16$	0.11
Gender	21 M/19 F	25 M/15 F	0.18
MMSE	$29.14 \pm 1.02$	$23.38 \pm 1.88$	<0.0001
ADAS 11	$5.42 \pm 3.14$	$20.85 \pm 9.16$	<0.0001
ADAS 13	$8.78 \pm 4.75$	$31.01 \pm 8.23$	<0.0001
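As a rough check on the group comparisons in Table 1, a two-sample t statistic can be computed directly from the reported summary values. The sketch below assumes that the ± values are standard deviations and uses Welch's unequal-variance form; the caption does not state which t-test variant was applied, so this is an assumption rather than a reproduction of the authors' analysis. The function name `welch_t` is hypothetical.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary data."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# MMSE row of Table 1: HC 29.14 +/- 1.02 (n = 40), AD 23.38 +/- 1.88 (n = 40)
t_mmse, df_mmse = welch_t(29.14, 1.02, 40, 23.38, 1.88, 40)
print(round(t_mmse, 2), round(df_mmse, 1))
```

For the MMSE row this gives a t statistic of roughly 17 on about 60 degrees of freedom, far beyond conventional significance thresholds and consistent with the reported p-value below 0.0001.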

Table 2. Most occurring regions.

Region	Percentage of Occurrence
Middle Occipital Gyrus (right)	0.16
Caudate (right)	0.12
Inferior Parietal Gyrus	0.12
Superior Temporal Gyrus (right)	0.08
Paracentral Lobule (left)	0.08
Postcentral Gyrus (right)	0.08
Rolandic Operculum (left)	0.08

Table 3. Most occurring anatomical areas.

Anatomical Area	Percentage of Occurrence
Occipital Lobe	0.24
Parietal Lobe	0.24
Temporal Lobe	0.20
Sub Cortical Grey Nuclei	0.16
Cerebellum	0.12
Medial Surface	0.12

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

