Introduction

In recent years, Artificial Intelligence (AI) techniques have been applied to the diagnosis of many nosological entities by means of data derived from radiological and nuclear medicine modalities [1,2,3,4,5,6,7,8,9,10].

AI is an extremely active research area based on computer programs able to mimic human activities, such as decision-making, learning, processing and understanding natural language and images [3, 10]. Machine learning (ML) is a branch of AI aiming at designing systems capable of automatically learning and improving from training data without being programmed explicitly. Machine learning methods play a major role in computer-assisted diagnosis, where they can suggest a diagnosis for an unknown case based on a number of previously classified (labelled) cases [4, 10]. Among the ML algorithms most commonly used in medical applications are Artificial Neural Networks (ANN), Classification Trees (ClT), Random Forest (RF), Support Vector Machines (SVM) and, more recently, Deep Learning (DL) [10, 11]. The background of the above-mentioned AI techniques is reported in detail in a previous paper [10]. In particular, DL has been receiving increasing attention in medical applications during the last few years. Thanks to their sophisticated image representation and filtering, convolutional neural networks (CNN) can provide robustness towards input variability caused by changes in hardware and software settings. However, methods based on CNN tend to be computationally expensive at the training step (due to the large number of tuning parameters in the inner layers); furthermore, they require a large number of training samples to provide satisfactory generalization performance.

The use of AI techniques in the diagnostic process of neurodegenerative diseases, such as dementia and movement disorders, represents a promising approach [10].

In recent years, the diagnosis of Parkinson’s disease (PD) and Parkinsonian syndromes versus Essential Tremor (ET) has been considerably supported by the increasing role of SPECT with presynaptic radiopharmaceuticals, such as the widely used 123I‐2β‐carbomethoxy‐3β‐4‐iodophenyl‐N‐3‐fluoropropyl nortropane (123I-FP-CIT) [12,13,14,15,16].

Furthermore, post-synaptic tracers or myocardial scintigraphy with 123I-metaiodobenzylguanidine (123I-MIBG) significantly contribute to the differential diagnosis between PD and other categories of Parkinsonism [17,18,19,20].

A relevant role in the differential diagnosis of PD and atypical Parkinsonian syndromes associated with dementia, i.e. multiple-system atrophy, progressive supranuclear palsy and corticobasal degeneration, is also played by 18Fluoro-deoxy-glucose (18FDG) PET, as the patterns of regional glucose metabolism of these nosological entities are different and disease-specific [21, 22]. Furthermore, patients with PD dementia (PDD) present more severe metabolic deficits in the parietal and frontal regions compared with PD patients without cognitive impairment, while metabolic patterns in PDD patients and patients with Lewy body dementia are broadly similar [22].

Many previous works dealing with the contribution of nuclear medicine modalities in movement disorders used AI techniques to investigate data extracted from SPECT and PET scans to automatically classify the pathological groups [3, 10, 14, 15, 23,24,25].

The objective of this review is to evaluate the most relevant papers on this topic and investigate the contribution of AI techniques to improve the diagnostic process of movement disorders.

Materials and methods

A selection of papers available on PubMed, Scopus and Web of Science was made. The search was based on the following key words: artificial intelligence, machine learning, neural network, deep learning, Parkinson, movement disorders, SPECT and PET. If “nuclear medicine techniques” is added as a further key word, the number of papers obtained is lower, and they are all included in the results obtained with the previous key words. Table 1 shows the number of papers available on PubMed (as it includes the papers with a higher clinical impact) with the key words used. Since not many articles are available on this topic (see Table 1), the choice of the papers described in this expert review was based on international relevance, interest for clinical practice, date (most recent) and impact (most cited). Articles not written in English were not considered.

Table 1 Number of papers available in PubMed with the key-words used on the topic considered in the review

Results

Early diagnosis of Parkinson’s disease

Early diagnosis of PD is crucial for a prompt therapeutic strategy and to avoid disease progression. Brain SPECT through 123I-FP-CIT, a pre-synaptic radiopharmaceutical of the dopaminergic transporters (DAT), proved able to provide a significant contribution to the differential diagnosis of early PD and non-Parkinsonian syndromes (i.e. ET) [12,13,14,15,16].

Brain SPECT with pre-synaptic radiopharmaceutical

To our knowledge, the first papers describing the role of artificial neural network classifiers to identify PD appeared in 2006.

Acton et al. [26] investigated 81 patients with PD and 94 healthy control subjects through SPECT with the dopamine transporter tracer 2-[[2-[[[3-(4-chlorophenyl)-8-methyl-8-azabicyclo[3,2,1]oct-2-yl]methyl](2-mercaptoethyl)amino]ethyl]amino]ethanethiolato(3-)-N2,N2′,S2,S2′]oxo-[1R-(exo-exo)]-(99m)Tc-technetium (99mTc-TRODAT-1). The striatum and the striatal pixel values extracted from the SPECT images were used as inputs to a three-layer ANN, and the same data set was used to train and test the ANN in a ‘leave-one-out’ procedure. The classifier achieved diagnostic accuracy higher than any previous analysis method applied to the same data (overall accuracy 94.4%, specificity 97.5% and sensitivity 91.4%). However, the authors concluded that, as with all applications of an ANN, it was difficult to interpret precisely which triggers in the images were being detected by the network. This paper, although very interesting, has the limitation that the radiopharmaceutical used was the scarcely available 99mTc-TRODAT-1.

In the same period, Hamilton et al. [27] carried out a study to differentiate PD and Essential Tremor using ANN classification of quantified data extracted from 123I-FP-CIT brain SPECT images. In particular, they evaluated whether a two-stage analysis could differentiate between various stages of Parkinsonian syndromes and essential tremor, and whether this kind of analysis could be undertaken in a single step using an ANN. In the first stage, the striatum-to-occipital cortex ratio was assessed to identify non-early Parkinsonian syndromes, and patients with a low ratio were not analysed further; in the second stage, the putamen-to-caudate nucleus ratio was measured. Finally, the two-stage analysis was repeated, in a single step, using an ANN. The authors determined that the two-stage analysis was less effective than the single-step process, and that the ANN could clearly discriminate between Parkinsonian syndromes and ET in all subjects studied without equivocal results. One limitation of this study, however, was the small sample size (n = 18) and the unbalanced dataset (13 PD and 5 ET). Another limitation was the “gold standard” used, which was the judgement of one single expert. Despite these limitations, this work is relevant in that it represents the first paper describing the use of an ANN on brain SPECT quantitative data with the most widely used radiopharmaceutical, 123I-FP-CIT, to discriminate PD and ET.

Palumbo et al. [14] investigated the differential diagnosis between PD and ET from 123I-FP-CIT SPECT semi-quantitative data. To this end, they compared the performance of two different machine learning techniques: a probabilistic neural network (PNN) and a classification tree. The study included 216 patients, of whom 89 had ET, 64 had early PD with a Hoehn and Yahr (H&Y) score ≤ 2 and 63 had advanced PD with an H&Y score ≥ 2.5. The dataset was randomly split into a training set and a test set of equal size (50% of the study population each) and the fraction of the test set correctly classified was computed. With PNN, the probability of correct classification was 81.9 ± 8.1% (mean ± SD) in patients with early PD, 78.9 ± 8.1% in patients with advanced PD and 96.6 ± 2.6% in patients with ET. With ClT, the first decision rule gave a cut-off value for the putamen of 5.99, with a probability of correct classification of 93.5 ± 3.4%. This means that patients with putamen values > 5.99 were classified as ET subjects, while patients with putamen values < 5.99 were classified as PD patients. The discrimination between the two classes of PD was possible by examining the caudate nucleus values. If the caudate nucleus value was higher than 6.97, patients were included in the early PD group (probability 69.8 ± 5.3%), while if the value was < 6.97, patients were classified as advanced PD subjects (probability 88.1 ± 8.8%).

This study showed that valid classification results could be obtained using both PNN and ClT classifiers, but one clear advantage of ClT is that it provides clinically interpretable cut-off values to differentiate between ET and PD of different severities. Comparing the overall results for the classification of patients with early and advanced PD, neither classifier emerged as clearly superior. Considering only patients with early PD, ClT seemed to have better discrimination capability than PNN, but the trend was reversed when patients with advanced PD were included. It should be noted that the data analyzed in this paper were extracted by a semi-quantitative method in which striatal uptake was evaluated through the specific/nonspecific putamen/occipital (p/o) and caudate/occipital (c/o) binding ratios. These ratios were computed using a standard ROI template, constructed manually according to a stereotactic atlas, with fixed regions for both the c/o and the p/o ratio applied to three different representative slices.
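The two-level decision rule reported for the classification tree in Palumbo et al. [14] can be written as a short function. This is only an illustrative sketch: the function name is hypothetical, and the handling of values exactly equal to a cut-off is an assumption, since the text gives only strict inequalities.

```python
# Cut-off values quoted in the text for the classification tree of [14].
PUTAMEN_CUTOFF = 5.99
CAUDATE_CUTOFF = 6.97


def classify_by_tree(putamen: float, caudate: float) -> str:
    """Apply the two decision rules in sequence.

    Assumption: a value exactly equal to a cut-off follows the lower branch;
    the source does not specify this case.
    """
    # First rule: the putamen value separates ET from PD.
    if putamen > PUTAMEN_CUTOFF:
        return "ET"
    # Second rule: within PD, the caudate value separates early from advanced.
    if caudate > CAUDATE_CUTOFF:
        return "early PD"
    return "advanced PD"
```

The appeal of this form is that each branch corresponds to a cut-off a clinician can read directly, which is the interpretability advantage of ClT noted above.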

More recent methods to semi-quantify 123I-FP-CIT uptake are based on volumetric ROIs. The freely available Basal Ganglia V2 software is widely used and is based on a high-definition, 3-D striatal template derived from the Talairach atlas [28, 29]. The software provides automatic, 3-D segmentation of the caudate and putamen in each hemisphere; an optimization protocol automatically adjusts the positioning of blurred templates to best match the radioactive counts and places an occipital region of interest (ROI) for background evaluation. Background binding is subtracted from putamen and caudate nucleus binding to measure the specific to non-displaceable binding ratio (SBR) in the caudate nucleus and putamen of each hemisphere.
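As a minimal sketch of the ratio described above, the SBR is conventionally computed as the background-subtracted regional uptake divided by the background uptake. The function below assumes mean counts per voxel as inputs and uses a hypothetical name; it does not reproduce the internal computation of Basal Ganglia V2.

```python
def specific_binding_ratio(region_counts: float, background_counts: float) -> float:
    """Return (region - background) / background.

    Assumption: both arguments are mean counts per voxel, with the occipital
    ROI used as the non-displaceable (background) reference.
    """
    if background_counts <= 0:
        raise ValueError("background counts must be positive")
    return (region_counts - background_counts) / background_counts
```

For instance, a putamen with three times the occipital mean counts gives `specific_binding_ratio(30.0, 10.0)`, i.e. an SBR of 2.0.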

To evaluate the diagnostic performance of 123I-FP-CIT brain SPECT with semi-quantitative analysis by Basal Ganglia V2 software, Palumbo et al. [15] analysed semi-quantitative data of patients with suspected PD using a support vector machine (SVM) classifier. An SVM implements supervised learning to analyze data and recognize patterns. It is commonly employed for classification and regression analysis. The SVM algorithm computes the class separation boundaries so as to maximize the distance between the example points belonging to different classes [30]. Support vector classifiers usually show good generalization capability, which allows them to classify patients not used in the training phase [31]. In that paper [15], 123I-FP-CIT SPECT was performed in 90 suspected PD patients having mild symptoms (bradykinesia-rigidity and mild tremor). Among them, PD was confirmed in 56 patients, while 34 were found to be non-PD (essential tremor and drug-induced Parkinsonism). To investigate the performance of the SVM models implemented in Basal Ganglia V2, the authors used different feature sets and estimated the classification accuracy via “leave-one-out” and “fivefold” cross-validation. The first feature set was composed of semi-quantitative radiopharmaceutical uptake values in the left and right putamen (PL, PR) and in the left and right caudate nucleus (CL, CR), for a total of four features (CL, CR, PL and PR). The second feature set considered CL and CR only, whereas the third considered PL and PR only. Age was added as a further descriptor to evaluate its influence on the classification performance.

A classification performance higher than 73.9% was obtained with all the models. For the “leave-one-out” method, PL and PR were better predictors (accuracy of 91% for all patients) than the CL and CR descriptors; using PL, PR, CL and CR together, diagnostic accuracy was similar to that of the PL and PR descriptors in the different groups. Adding age as a feature increased accuracy in all the models. The best results were obtained using all five descriptors, both in PD and non-PD subjects (CR and CL + PR and PL + age = 96.4% and 94.1%, respectively). Similar results were observed for the “fivefold” method. In conclusion, 123I-FP-CIT SPECT with BasGan analysis using an SVM classifier was able to diagnose PD. The putamen proved to be the most discriminative descriptor for PD, and patient age influenced the classification accuracy. The limitation of this study was the absence of healthy control subjects, as patients with mild symptoms (bradykinesia-rigidity and mild tremor) were examined for clinical purposes to confirm or exclude PD. Therefore, the patients studied were considered as having PD or not (non-PD group), but a further group of normal subjects was not available.
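The “leave-one-out” estimate used in the study above can be sketched as follows: each patient is held out in turn, the classifier is trained on the remaining patients, and the held-out patient is then classified. To keep the example free of external dependencies, a simple nearest-centroid classifier is used here as a stand-in for the SVM; this substitution, the function names and the feature values (ordered CL, CR, PL, PR) are all illustrative assumptions and not data from [15].

```python
from statistics import mean


def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (NOT an SVM): assign the class with the closest mean vector."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(column) for column in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda lab: squared_distance(centroids[lab], sample))


def leave_one_out_accuracy(features, labels):
    """Hold out each case once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Synthetic, made-up uptake values in the order [CL, CR, PL, PR].
FEATURES = [
    [7.0, 7.1, 6.5, 6.4], [6.8, 6.9, 6.2, 6.3], [7.2, 7.0, 6.6, 6.5],  # non-PD
    [5.5, 5.2, 3.0, 2.8], [5.0, 5.3, 2.5, 2.9], [5.8, 5.6, 3.4, 3.1],  # PD
]
LABELS = ["non-PD", "non-PD", "non-PD", "PD", "PD", "PD"]
```

Calling `leave_one_out_accuracy(FEATURES, LABELS)` returns the fraction of held-out cases classified correctly. A fivefold scheme differs only in holding out one fifth of the cases at a time instead of a single case.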

Prashanth et al. [32] proposed SVM to classify 123I-FP-CIT brain SPECT data of patients with Parkinsonian syndromes. The authors analysed data from the Parkinson's Progression Markers Initiative (PPMI) database [33], a landmark study and the first large-scale, comprehensive, observational, international, multi-center study to identify PD progression biomarkers. The imaging features were Striatal Binding Ratio (SBR) values of the four striatal regions (left and right caudate, left and right putamen), which were computed from 123I-FP-CIT SPECT images and were available from the PPMI database. The authors showed that SVM provided valuable results in correctly classifying PD versus normal subjects. The strengths of this interesting study are the large population (369 early PD and 179 normal subjects) and the high classification performance, which was attained using only four features (left and right caudate, left and right putamen) and no feature selection procedure.

Taylor and Fenner [34] focused on the comparison of machine learning and semi-quantification methods to classify patients with movement disorders. Three machine learning algorithms were compared with a range of semi-quantification methods using data from the PPMI database (209 healthy controls and 448 PD subjects) and a locally derived dataset from Sheffield Teaching Hospitals NHS Foundation Trust (113 patients without PDD and 191 with PDD). Machine learning algorithms were based on SVM with three different sets of features: voxel intensities, principal components of image voxel intensities, and striatal binding ratios from the putamen and caudate. Semi-quantification methods calculate striatal binding ratios (SBRs) from regions of interest applied to the full SPECT volume or selected slices, typically after automated registration to a chosen template, and were generally based on SBRs from both putamina, with and without consideration of the caudates. Normal limits for the SBRs were defined according to four different methods: minimum of age-matched controls; mean minus 1/1.5/2 standard deviations from age-matched controls; linear regression of normal patient data against age (minus 1/1.5/2 standard errors); and selection of the optimum operating point on the receiver operating characteristic curve from normal and abnormal training data. Each machine learning and semi-quantification technique was evaluated with stratified, nested tenfold cross-validation. The mean accuracy of the semi-quantitative methods varied from 0.78 to 0.87 for classifying local data into Parkinsonian and non-Parkinsonian groups, and from 0.89 to 0.95 for classifying PPMI data into healthy controls and Parkinson’s disease groups. The machine learning algorithms provided mean accuracy between 0.88–0.92 and 0.95–0.97 for local and PPMI data, respectively.

The authors concluded that classification performance for the local database was lower than for the research database for both semi-quantitative and machine learning algorithms. Furthermore, in both datasets, they found that the machine learning methods provided equal or higher mean accuracy (with lower variance) than any of the semi-quantification approaches, although the gain from using machine learning algorithms was relatively small and might not be sufficient to offer a significant clinical improvement.
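One of the normal-limit definitions listed above, the mean of the control SBRs minus 1, 1.5 or 2 standard deviations, can be sketched in a few lines. The function names are hypothetical, the sample standard deviation is assumed, and age matching of the controls is omitted for brevity; the control values in any usage are made up.

```python
from statistics import mean, stdev


def lower_normal_limit(control_sbrs, k):
    """Normal limit = mean of control SBRs minus k sample standard deviations.

    Assumption: the controls passed in are already age-matched.
    """
    return mean(control_sbrs) - k * stdev(control_sbrs)


def is_abnormal(sbr, limit):
    """An SBR below the normal limit is flagged as abnormal."""
    return sbr < limit
```

With made-up control SBRs of 2.0, 3.0 and 4.0 and k = 2, the limit is 1.0, so a patient SBR of 0.8 would be flagged as abnormal. A larger k lowers the limit, trading sensitivity for specificity.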

Finally, a previous paper by Prashanth et al. [35] investigated the role of the symptoms developed in the premotor or prodromal phase of PD, which lasts at least five years (and up to 20 years) between the onset of neurodegeneration and the manifestation of classic clinical motor symptoms. During this phase, non-motor symptoms, such as Rapid Eye Movement sleep Behaviour Disorder (RBD) and olfactory loss, are present. Although none of these symptoms is sufficiently specific to be used for screening, they can be used in combination with other potential biomarkers, such as cerebrospinal fluid (CSF) measurements and 123I-FP-CIT brain SPECT, to discriminate subjects at risk of PD [36, 37]. Prashanth et al. [35] used the non-motor features of RBD and olfactory loss, along with other significant biomarkers, such as CSF measurements and dopaminergic imaging markers (123I-FP-CIT brain SPECT), from 183 healthy normal and 401 early PD subjects (Hoehn and Yahr scale of 1 and 2), derived from the PPMI database [33], to classify the subjects studied using Naïve Bayes, SVM, Boosted Trees and Random Forests classifiers. The authors showed that the SVM classifier achieved the best performance (96.40% accuracy, 97.03% sensitivity, 95.01% specificity, and 98.88% area under the ROC curve). Furthermore, the authors stressed that they obtained better accuracy than previous works using significant features corresponding to non-motor and imaging markers. The authors concluded that a combination of non-motor, CSF and imaging markers may be useful in the preclinical diagnosis of PD.
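The accuracy, sensitivity and specificity figures quoted for the classifiers above all derive from the same four confusion-matrix counts. The sketch below shows the standard definitions, with PD taken as the positive class; the function name and the counts in the usage note are illustrative and are not taken from any of the cited studies.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: patients with disease classified correctly/incorrectly.
    tn/fp: subjects without disease classified correctly/incorrectly.
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of diseased subjects detected
        "specificity": tn / (tn + fp),  # fraction of non-diseased subjects cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }
```

For example, `diagnostic_metrics(tp=9, fn=1, tn=8, fp=2)` gives a sensitivity of 0.9, a specificity of 0.8 and an accuracy of 0.85. In an unbalanced dataset, accuracy alone can be misleading, which is why the studies reviewed here usually report all three.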

Brain PET

Positron-emitting pre-synaptic tracers can be useful to investigate movement disorders, even if they are less widely used compared with analogous, gamma-emitting radiocompounds. Few papers are available on the use of AI to diagnose early PD with positron-emitting pre-synaptic tracers.

In an interesting work, Glaab et al. integrated blood metabolomics and PET brain neuroimaging data to diagnose Parkinson's disease [23]. Previous studies on metabolome changes in PD had mainly focused on systemic metabolic profiles alone, while the authors proposed a combination with other modalities, such as PET, to provide information about the relationship between local and systemic changes, both in terms of diagnostic usefulness and understanding of pathophysiology. The authors investigated whether the joint analysis of blood metabolomics and PET imaging data by machine learning was able to provide increased diagnostic discrimination and further pathophysiological insights. Blood plasma samples were collected from 60 PD patients and 15 age- and gender-matched healthy controls to determine metabolomic profiles by gas chromatography coupled to mass spectrometry (GC–MS). The metabolomics profiles of 71 detected metabolites were compared at study baseline between the PD patients and the unaffected controls. Levels of the unknown metabolite RI1446 showed a significant increase (FDR < 0.05). A suggestive increase (here defined as nominal p value < 0.05 and FDR < 0.5) was found for urea, hexadecanoic acid and dodecanoic acid, and the unknown metabolite RI1050. In the same cohort and at the same time, 18F-DOPA PET was performed in 44 patients and 14 controls and FDG PET was carried out in 51 patients and 16 controls. Eighteen PD patients had a follow-up exam after 1 year. Both data sets were processed using two machine learning approaches, linear support vector machines and random forests, with a leave-one-out cross-validation scheme. In the metabolomics data, the baseline comparison between patients and controls and the follow-up assessment of patients pointed to metabolite changes associated with oxidative stress and inflammation.
Considering 18F-DOPA and FDG PET data, the diagnostic predictive performance (DPP) in the ROC analyses was highest when imaging features were combined with metabolomics data (ROC AUC for best FDOPA + metabolomics model: 0.98; AUC for best FDG + metabolomics model: 0.91). DPP was lower if only PET attributes or only metabolomics signatures were used. The authors concluded that the integration of blood metabolomics data and PET data remarkably increased the diagnostic discrimination power.
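ROC AUC values such as those reported above can be read as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control. The following sketch computes the AUC directly from that pairwise definition; the function name is hypothetical and the scores in the usage note are made up.

```python
def roc_auc(scores_patients, scores_controls):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Each patient/control pair counts 1 if the patient scores higher,
    0.5 if the scores are tied and 0 otherwise.
    """
    wins = 0.0
    for p in scores_patients:
        for c in scores_controls:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(scores_patients) * len(scores_controls))
```

With patient scores `[0.9, 0.8, 0.4]` and control scores `[0.5, 0.3, 0.2]`, eight of the nine pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.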

A further interesting paper by Wu et al. [25] suggested that radiomics features could describe the brain tissue heterogeneity in PD to differentiate between patients and normal controls. Radiomics allows the extraction of a large number of high-level features from medical images using high-throughput calculations. Such features reflect potential pathophysiological information useful for clinical diagnosis. In a retrospective multi-cohort study, Wu et al. extracted radiomics features from 18F-FDG PET images and collected clinical scale results [the Unified Parkinson’s Disease Rating Scale (UPDRS) and Hoehn & Yahr scale (H&Y)] of two cohorts of patients, included to document the stability and applicability of the proposed radiomic methods. The study was conducted on different samples from different PET scanners with different imaging characteristics. The cohort from Huashan Hospital consisted of 91 normal controls (NC) and 91 PD patients (UPDRS: 22.7 ± 11.7, H&Y: 1.8 ± 0.8), while the other cohort from Wuxi 904 Hospital included 26 NC and 22 PD patients (UPDRS: 20.9 ± 11.6, H&Y: 1.7 ± 0.9). The Huashan cohort was used as the training and test sets by fivefold cross-validation, whilst the Wuxi cohort was used as an independent test set. Regions of interest (ROIs) were identified through the atlas-based method; radiomic features were extracted and selected using autocorrelation and the Fisher score algorithm. A support vector machine (SVM) was trained to classify PD and NC based on the selected radiomic features. In the experiments, the authors compared their method with the traditional voxel values method. To increase robustness, the processes were repeated 500 times. As a result, 26 brain ROIs were identified and a total of 610 radiomic features were extracted. Only 30 out of the 610 features remained after feature selection. The accuracy of the proposed method was 90.97% ± 4.66% and 88.08% ± 5.27% in the Huashan and Wuxi test sets, respectively.
The authors concluded that high-order radiomic features extracted from 18F-FDG PET brain images provided a valuable computer-aided diagnosis of PD, as the radiomic features proved more accurate for diagnosis than traditional voxel values. Finally, they suggested that future research may focus on the use of high-order radiomic features as quantitative biomarkers for early diagnosis of PD.
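Feature selection by Fisher score, mentioned above, ranks each feature by the ratio of between-class to within-class variability and keeps the highest-ranked ones. The sketch below uses one common definition of the score; whether Wu et al. [25] used exactly this variant is an assumption, and the function names and toy values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall_mean = mean(values)
    between = 0.0
    within = 0.0
    for label in set(labels):
        group = [v for v, y in zip(values, labels) if y == label]
        between += len(group) * (mean(group) - overall_mean) ** 2
        within += len(group) * pvariance(group)
    # A feature with no within-class spread separates the classes perfectly.
    return between / within if within > 0 else float("inf")


def top_features(feature_columns, labels, n_keep):
    """Return the names of the n_keep features with the highest Fisher score."""
    scores = {name: fisher_score(col, labels) for name, col in feature_columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]
```

With labels `[0, 0, 0, 1, 1, 1]`, a feature with values `[1, 2, 3, 7, 8, 9]` scores 13.5, whereas one with values `[1, 9, 5, 2, 8, 5]` scores 0 because its class means coincide. Reducing 610 features to 30 in this way limits overfitting when the number of patients is small relative to the number of features.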

In a recent paper, Choi et al. [38] developed a deep learning-based cognitive signature of FDG brain PET suitable for PD and AD. A deep convolutional neural network (CNN) model to differentiate AD and normal controls (NCs) was built using a training set composed of 636 FDG PET scans obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. The model was then transferred to images of 666 patients with mild cognitive impairment (MCI) to identify who would rapidly convert to AD, and it was tested in a further independent cohort including 62 PD patients to identify PD patients with dementia. The model accuracy was measured by the area under the curve (AUC). The relationship between all images was visualized by two-dimensional projection of the deep learning-based features (128 for each patient), and the model was also designed to predict the cognitive score of the examined individuals and validated in PD patients. Feature maps of the deep CNN model allowed visualization of cognitive dysfunction-related regions.

The authors showed that the ROC AUC for discriminating AD from NC was 0.94 (95% CI 0.89–0.98). The transferred model was able to identify MCI patients converting to AD (AUC = 0.82) and PD with dementia (AUC = 0.81). The two-dimensional projection mapping visualized the degree of cognitive dysfunction compared with normal brains independently of the different disease cohorts. Brain metabolic features related to cognitive dysfunction were localized on individual FDG PET images. The cognitive dysfunction-related regions differed among individuals, as shown for three PD patients with dementia. The regions included the cingulate and high frontal/parietal cortices: in the first patient they were located in the posterior cingulate area, in the second patient in the superior frontal and parietal cortices and in the third patient in the cingulate and superior frontoparietal cortices. Cognitive dysfunction-related regions were averaged across AD patients and PD patients with dementia, respectively, and the averaged regions were the cingulate and superior frontal/parietal cortices.

Mini-mental status exam scores correlated highly with the predicted cognitive score, an output of the model. Finally, the authors concluded that the deep learning-based cognitive function evaluation model could be successfully transferred to multiple disease domains, thus suggesting that this approach could be extended to an objective cognitive signature able to provide a quantitative biomarker for cognitive dysfunction in different neurodegenerative disorders, such as AD and PD.

The main results of the papers presented in this review that applied AI techniques to nuclear medicine modalities in early PD are reported in Table 2.

Table 2 Characteristics and main results of the papers about the early diagnosis of PD

Differential diagnosis between PD and other Parkinsonian syndromes

Differential diagnosis between PD and other Parkinsonian disorders, such as Multiple System Atrophy (MSA), Progressive Supranuclear Palsy (PSP) and Corticobasal Degeneration (CBD), in addition to vascular and drug-induced Parkinsonism, is also a challenging task. It is clinically important because PD can be treated effectively with l-Dopa or dopamine agonists, while other Parkinsonian syndromes do not show a valid clinical response. 123I-FP-CIT brain SPECT can help discriminate degenerative Parkinsonian syndromes from Essential Tremor, psychogenic and vascular Parkinsonism, as these nosological entities have a normal nigrostriatal function [14, 19, 39]. However, 123I-FP-CIT brain SPECT alone is not sufficient to differentiate between the various types of degenerative Parkinsonian syndromes.

It has been shown that neurodegenerative diseases share a common pathogenetic molecular mechanism characterized by increased accumulation of different, non-degraded, normal proteins in specific neuronal populations and in glial cells, thus leading to the definition of proteinopathies, whose pathology is mediated by the type of proteins and by the anatomical localization of protein deposits [19]. Proteinopathies include Tauopathies and Alpha-synucleinopathies. Tauopathies include Alzheimer’s disease (AD), fronto-temporal dementia with parkinsonism linked to tau mutation on chromosome 17 (FTDP-17T), Pick disease, PSP and CBD, and they are characterized by intracellular inclusions of hyperphosphorylated and aggregated tau proteins in the form of neurofibrillary tangles. Alpha-synucleinopathies include PD, PD with dementia (PDD), Lewy Body Dementia (LBD) and MSA, showing the presence of aggregates of phosphorylated alpha-synuclein protein called Lewy Bodies. However, some authors have suggested that alpha-synucleinopathies and tauopathies cannot be considered as isolated categories because of the dubious or mixed clinical symptoms, especially at the early stages, although the histopathological classification of neurodegenerative diseases is based on the nature and localization of these deposits in the nervous system [19, 40].

The combination of 123I-FP-CIT brain SPECT and 123I-MIBG myocardial scintigraphy is currently considered a valuable strategy for differential diagnosis between PD and other degenerative and vascular Parkinsonian syndromes [19, 34]. Indeed 123I-MIBG myocardial scintigraphy makes it possible to assess the sympathetic cardiac nerve terminals, as PD subjects have a decreased cardiac MIBG uptake compared with other Parkinsonian syndromes where 123I-MIBG uptake is usually normal [40, 41].

123I-MIBG scintigraphy and post-synaptic radiopharmaceuticals

In a recent paper by our group [20], we performed MIBG myocardial scintigraphy in 85 subjects with Parkinsonian syndrome. Fifty of them had idiopathic Parkinson’s disease, 26 atypical Parkinsonian syndromes (P) and nine essential tremor. Images were evaluated both qualitatively and quantitatively, the latter by computing the H/M ratio from regions of interest drawn on planar early and delayed images (ROIt1 and ROIt2). The numerical data obtained by semi-quantitative analysis were further processed using two machine learning techniques (SVM and RF) to obtain reliable and reproducible heart/mediastinum (H/M) ratio cut-off values for Parkinsonian disorders. SVM classifier characteristics were previously described [15].

Let us just recall here that a Random Forest classifier is an ensemble classifier which uses a set of classification trees to estimate the output (diagnostic) class as a function of the input features. In RF, each basic classifier (ClT) proposes a diagnostic class (vote) and the final output is the class label with the most votes [42, 43]. In Nuvoli et al. [20], both SVM and RF classifiers were used to obtain the cut-off values.
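The voting step of a Random Forest described above can be illustrated with a minimal sketch. The three "trees" below are toy single-threshold rules on an H/M ratio, with made-up thresholds chosen only for illustration; a real forest would learn many deeper trees from bootstrapped training data, and all names here are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return the class label proposed by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]


def forest_predict(trees, sample):
    """Collect one vote per tree and return the winning class label."""
    return majority_vote([tree(sample) for tree in trees])


def make_stump(threshold):
    """Build a toy one-rule tree: H/M ratio below the threshold votes 'PD'."""
    return lambda hm_ratio: "PD" if hm_ratio < threshold else "non-PD"


# Illustrative thresholds only; these are not values from any cited study.
TOY_TREES = [make_stump(1.4), make_stump(1.6), make_stump(1.7)]
```

For an H/M ratio of 1.5, the first tree votes "non-PD" and the other two vote "PD", so `forest_predict(TOY_TREES, 1.5)` returns "PD". Using an odd number of trees avoids ties in a two-class problem.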

The results indicate that SVM and RF attained excellent classification performances, with SVM providing the highest accuracy followed by RF. The optimal H/M cut-off value was 1.55; it remained the same for both ROIt1 and ROIt2 and allowed PD to be correctly distinguished from Parkinsonian syndromes and ET. If patients had H/M < 1.55, they were diagnosed as PD; if they had values ≥ 1.55, they were classified as Parkinsonism and/or ET. No difference was found when early or late H/M ratios were considered separately. The authors concluded that the use of SVM and RF made it possible to define the optimal H/M cut-off value in either the early or the delayed phase, thus confirming the pivotal role played by [123I]MIBG cardiac scintigraphy and the effectiveness of the H/M ratio in the differential diagnosis between PD, ET and other Parkinsonism. An interesting result was that early scans alone could be used for a reliable diagnosis, as no difference in accuracy emerged from using early or late scans, although this finding needs to be confirmed in larger studies. To our knowledge, this was the first study investigating the role of automatic classifiers to diagnose Parkinsonian syndromes based on data derived from [123I]MIBG cardiac scintigraphy.

A paper by Segovia et al. [44] investigated the role of the positron-emitting radiopharmaceutical 18F-Desmethoxyfallypride ([18F]DMFP), which enables imaging of the postsynaptic striatal dopaminergic deficit that characterizes non-idiopathic Parkinsonian variants, such as multiple system atrophy (MSA) or progressive supranuclear palsy (PSP). The authors used an SVM classifier on pre-processed [18F]DMFP PET data from 87 scans collected in a longitudinal study carried out at the University of Munich [45] to distinguish idiopathic from non-idiopathic Parkinsonian patients (i.e., PD vs. MSA and PSP). Since delimitation of the regions of interest in DMFP-PET images is of great importance for improving the automatic diagnosis of PD, the authors proposed a novel methodology to pre-process 18F-DMFP-PET scan data to increase the accuracy of computer-aided diagnosis systems. The data were segmented using an algorithm based on Hidden Markov Random Fields. As a result, each neuroimage was divided into four maps according to the intensity and the neighborhood of the voxels, and these maps were then normalized to a Gaussian distribution with equal parameters for all the neuroimages. After these pre-processing steps, an SVM classifier was used to discriminate idiopathic and non-idiopathic PD. Data pre-processed using this method provided higher accuracy compared with those pre-processed with previously applied approaches. A previous paper of this group [46] evaluated the ability of [18F]DMFP PET in the same dataset of patients to distinguish Parkinson’s disease from atypical Parkinsonian syndromes via an automated system based on support vector machines and Bayesian networks. Although the approaches proposed in the above papers are very promising, they have one main disadvantage, i.e., the use of a rather uncommon radiotracer, which makes the methods difficult to translate into clinical practice.

Dopamine Transporters (DAT) imaging

Although 123I-FP-CIT brain SPECT is considered of limited use for the differential diagnosis between PD and degenerative Parkinsonian syndromes, some papers have investigated whether analyzing its data with AI techniques can improve its clinical usefulness for this task. In this direction, Badoud et al. [47] employed advanced image analysis techniques to discriminate between PD and different atypical Parkinsonian syndromes (APS) by evaluating data derived from 392 consecutive patients with degenerative Parkinsonism undergoing 123I-FP-CIT brain SPECT. This patient series included 306 PD, 24 multiple system atrophy (MSA), 32 progressive supranuclear palsy (PSP) and 30 corticobasal degeneration (CBD) subjects. Data analysis was based on voxel-wise univariate statistical parametric mapping and multivariate pattern recognition through linear discriminant classifiers. MSA and PSP showed less radiopharmaceutical uptake in the head of the caudate nucleus compared with PD and CBD, while no difference was observed between MSA and PSP. CBD had higher uptake in both putamina with respect to PD, MSA and PSP. Classification was significant for PD versus APS (AUC 0.69, p < 0.05) and between APS categories (MSA vs CBD AUC 0.80, p < 0.05; MSA vs PSP AUC 0.69, p < 0.05; CBD vs PSP AUC 0.69, p < 0.05).
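
For readers less familiar with the AUC values quoted above, the following sketch shows one standard way to compute an AUC: as the fraction of (case from class A, case from class B) pairs in which the class A case receives the higher classifier score, with ties counted as one half (the Mann-Whitney formulation). The function name and the scores are invented for illustration; they are not data from Badoud et al.

```python
from typing import Sequence


def auc_from_scores(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties contribute 0.5, which matches the Mann-Whitney U statistic
    divided by the number of positive/negative pairs.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Hypothetical discriminant scores for two diagnostic groups.
    group_a = [0.9, 0.8, 0.3]
    group_b = [0.7, 0.2, 0.1]
    print(round(auc_from_scores(group_a, group_b), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading values such as 0.69 and 0.80.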

It should be noted that the authors, based on the assumption that PD and APS might also affect 123I-FP-CIT uptake outside the striatum, carried out the classification analysis three times: within the striatum only, in the remaining brain outside the striatum, and on the entire brain. Although the first proved the most informative for classification, the other two also provided significant results, indicating that dopaminergic impairment of extrastriatal brain regions in PD and APS can be evaluated by advanced analysis techniques of molecular imaging. Some studies have reported an affinity of 123I-Ioflupane for noradrenergic and serotoninergic receptors, especially within extrastriatal areas [48]. Therefore, non-dopaminergic uptake has to be taken into account when evaluating the results of Badoud et al., which show that extra-striatal regions also provide relevant information for diagnosing PD and other extrapyramidal syndromes. A challenge for future research is to investigate to what extent the discriminative information relies on textural features.

However, the authors concluded that, although both striatal and extra-striatal regions contain classification information, combining the two did not significantly improve classification accuracy. Finally, the paper showed that PD, MSA, PSP and CBD present different patterns of dopaminergic depletion as investigated by 123I-FP-CIT SPECT. The high specificity of 84–90% for PD versus APS allowed the authors to conclude that the classifier was particularly useful to confirm atypical Parkinsonian syndromes.

One clear strength of this work is the large population of subjects examined (n = 392 patients). Moreover, all the patients were scanned consecutively on the same machine using the same acquisition protocol and, to avoid operator-related variability, all the post-processing procedures were performed in an operator-independent manner, including the creation of a study-specific template. Sex and age data were used as non-explanatory co-regressors to compensate for their impact on the nigrostriatal pathway. One possible limitation, however, is the retrospective nature of the study, although the long follow-up permitted some diagnostic reallocation, which is an advantage for the aim of this diagnosis-oriented study. Finally, the authors noted that, as neuropathological data were not available for most patients, the diagnosis of PD and APS was clinical and based on well-established diagnostic criteria. In our opinion, this is the standard in the majority of studies carried out in these neurodegenerative groups, particularly in such a large cohort of individuals.

A further interesting study on this topic by Nicastro et al. [49] presented an automated classification method to diagnose degenerative Parkinsonian syndromes based on semi-quantitative 123I-FP-CIT SPECT striatal indices and SVM analysis. 123I-FP-CIT SPECT was carried out at a single centre in 370 patients (280 with Parkinson’s disease (PD), 21 with multiple system atrophy-Parkinsonian type (MSA-P), 41 with progressive supranuclear palsy (PSP) and 28 with corticobasal syndrome (CBS)) and in 208 age- and gender-matched control subjects (mean age of the whole population 70.3 years, 47% female, mean disease duration at scan 1.4 years). Striatal volumes-of-interest (VOIs) uptake, VOIs asymmetry indices (AIs) and caudate/putamen (C/P) ratio were fed into an SVM for individual classification, and accuracy was estimated through fivefold cross-validation. Univariate analysis showed significantly lower VOIs uptake and higher striatal AI and C/P ratio for each Parkinsonian syndrome compared with the controls (all p < 0.001). MSA-P and PSP showed the highest degree of striatal impairment, while CBS presented a moderate uptake decrease and higher AI. The accuracy of binary classification of Parkinsonian syndromes versus controls was 92.9%, while that of pairwise binary classification between different Parkinsonian syndromes ranged between 62.9% and 83.7%. The best results were obtained when separating CBS from the other syndromes. Sensitivity and specificity values ranged between 60 and 80% for all the analyses. Striatal AI and C/P ratio on the more affected side had the highest weighting factors. In conclusion, the study determined that semi-quantitative 123I-FP-CIT SPECT striatal evaluation combined with SVM represented a challenging approach to discriminate PD from non-degenerative conditions and from atypical Parkinsonian syndromes at an early stage.
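
To make the semi-quantitative indices more concrete, the sketch below derives asymmetry indices and caudate/putamen ratios from four VOI uptake values and assembles them into a feature vector of the kind that could be passed to a classifier. The exact definitions used by Nicastro et al. are not reported here, so the formulas are assumptions: the asymmetry index is taken as the absolute left-right difference divided by the left-right mean, times 100 (a commonly used form), and the C/P ratio as caudate uptake divided by putamen uptake. All names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StriatalUptake:
    """Hypothetical container for per-VOI uptake values of one subject."""
    caudate_left: float
    caudate_right: float
    putamen_left: float
    putamen_right: float


def asymmetry_index(left: float, right: float) -> float:
    """Assumed definition: |L - R| / mean(L, R) * 100."""
    mean = (left + right) / 2.0
    if mean == 0:
        raise ValueError("mean uptake must be non-zero")
    return abs(left - right) / mean * 100.0


def caudate_putamen_ratio(caudate: float, putamen: float) -> float:
    """Assumed definition: caudate uptake divided by putamen uptake."""
    if putamen == 0:
        raise ValueError("putamen uptake must be non-zero")
    return caudate / putamen


def feature_vector(u: StriatalUptake) -> List[float]:
    """Concatenate raw uptakes, asymmetry indices and C/P ratios."""
    return [
        u.caudate_left, u.caudate_right, u.putamen_left, u.putamen_right,
        asymmetry_index(u.caudate_left, u.caudate_right),
        asymmetry_index(u.putamen_left, u.putamen_right),
        caudate_putamen_ratio(u.caudate_left, u.putamen_left),
        caudate_putamen_ratio(u.caudate_right, u.putamen_right),
    ]


if __name__ == "__main__":
    subject = StriatalUptake(2.4, 2.0, 1.6, 1.0)  # made-up uptake ratios
    print(feature_vector(subject))
```

Under these assumed definitions, a more asymmetric or more putamen-predominant uptake loss yields larger AI and C/P values, which is the direction of the univariate differences described above.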

Finally, a paper by Huertas-Fernández et al. [50] on differentiating vascular Parkinsonism from PD with machine learning models deserves mention. Vascular Parkinsonism (VP) is a Parkinsonian syndrome caused by cerebrovascular lesions and is clinically characterized by gait difficulties, symmetrical lower body bradykinesia, postural instability and absence of resting tremor [50, 51]. Although there are hallmarks to differentiate between VP and idiopathic Parkinson’s disease (PD), an overlap in symptom presentation may occur and the differential diagnosis between these conditions is not always easy [50, 52]. The aim of the study by Huertas-Fernández et al. [50] was to develop diagnostic predictive models using data from two commonly used [123I]FP-CIT SPECT assessment methods: region-of-interest (ROI) analysis and whole-brain voxel-based analysis. Eighty patients with VP and 164 patients with PD who had undergone [123I]FP-CIT SPECT were examined retrospectively. Nuclear medicine specialists evaluated the images and calculated bilateral caudate and putamen [123I]FP-CIT uptake and asymmetry indices using BRASS software. Statistical parametric mapping (SPM) was performed to compare the radioligand uptake between the two diseases at the voxel level. Quantitative data obtained by these two methods, in addition to potential confounding factors for dopamine transporter availability (sex, age, disease duration and severity), were used to build predictive models. The performance of logistic regression (LR), linear discriminant analysis and support vector machine (SVM) algorithms for ROI data, and their penalized versions for SPM data (penalized LR, penalized discriminant analysis and SVM), were evaluated via tenfold cross-validation. After covariate correction, the ROI analysis showed significant differences between VP and PD patients in [123I]FP-CIT uptake in the more affected side of the putamen and the ipsilateral caudate. Age, disease duration and severity were also found to be informative for building the statistical model. SPM analysis showed significant reductions of [123I]FP-CIT uptake in PD compared with VP in two symmetrical clusters including areas corresponding to the left and right striatum. The LR model using ROI data reached a diagnostic predictive accuracy of 90.3% and the SVM model using SPM data reached 90.4%. The predictive models built with ROI data and SPM data derived from [123I]FP-CIT SPECT therefore showed very good discrimination accuracy between VP and PD. However, the authors suggested that an external validation of the proposed methods was necessary to confirm their applicability across different centres.
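
The k-fold cross-validation used in several of the studies above to estimate accuracy can be sketched generically as follows. This is an illustration under stated assumptions, not the pipeline of Huertas-Fernández et al.: the fold assignment, the `fit`/`predict` interface and the toy single-feature threshold classifier (standing in for LR or SVM) are all hypothetical, and the data are synthetic.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle sample indices and deal them into k nearly equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(X: Sequence[float], y: Sequence[int],
                             fit: Callable, predict: Callable,
                             k: int = 10, seed: int = 0) -> float:
    """Train on k-1 folds, test on the held-out fold, pool the hits."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def fit_threshold(X: Sequence[float], y: Sequence[int]) -> Tuple[float, bool]:
    """Toy model: midpoint between the two class means of one feature.

    Raises statistics.StatisticsError if a class is absent from training.
    """
    mean0 = mean(x for x, label in zip(X, y) if label == 0)
    mean1 = mean(x for x, label in zip(X, y) if label == 1)
    return (mean0 + mean1) / 2.0, mean1 > mean0


def predict_threshold(model: Tuple[float, bool], x: float) -> int:
    """Assign class 1 on the side of the threshold where its mean lies."""
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)


if __name__ == "__main__":
    # Synthetic, well-separated single-feature data for two classes.
    X = [1.0 + 0.1 * i for i in range(10)] + [3.0 + 0.1 * i for i in range(10)]
    y = [0] * 10 + [1] * 10
    print(cross_validated_accuracy(X, y, fit_threshold, predict_threshold, k=10))
```

Because every sample is held out exactly once, the pooled accuracy is an estimate of performance on unseen cases from the same centre; it does not replace the external validation that the authors called for.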

The main characteristics of the papers presented in this review that used AI techniques applied to nuclear medicine modalities to differentiate PD from other Parkinsonian syndromes are reported in Table 3.

Table 3 Characteristics and main results of the papers about differential diagnosis between PD and other Parkinsonian syndromes

Discussion and conclusion

In the era of big data, precision medicine is a rapidly evolving field aiming at finding associations between quantitative features extracted from imaging data and clinical outcome. This has triggered intense research activity in the last few years, so much so that completely new topics, if not disciplines, have emerged, such as radiomics. It is believed that computer-assisted analysis of quantitative information from imaging data may complement standard radiological interpretation and, therefore, help improve diagnostic assessment in a number of disorders. The rationale behind this is that computerised methods could capture and analyse complex patterns that would otherwise go unnoticed by the human eye.

In particular, there is increasing evidence that AI techniques applied to nuclear medicine modalities can play a major role in the differential diagnosis of movement disorders. This parallels the increasing research interest that AI is attracting in other areas, such as nuclear medicine, radiology, oncology, neurology and histopathology [5, 6, 10, 53,54,55,56]. As the late Prof. Lucignani effectively underlined in a recent paper [1], “quantitative methods for assessing the status of imaging biomarkers are entering the diagnostic workflow, most likely, the human visual interpretation of images will become insufficient and possibly inadequate for good clinical practice and research. Although there will always be a qualitative aspect to imaging, AI and quantitative metrics will supplement and complement the current “human” methods of interpretation of imaging data in a holistic approach to individual patient management”.

In this scenario, AI techniques have the potential not only to improve the clinical classification in the diagnosis of movement disorders, but also to help discriminate among different pathological entities, therefore paving the way towards personalized medicine.

Although there is currently broad consensus about the potential advantages of using artificial intelligence in nuclear medicine for the management of Parkinson's disease and Parkinsonian syndromes, some obstacles remain before these methodologies can be translated into clinical practice. One crucial step towards this end is standardisation: there is still large variability in the implementation of the overall pipeline, as regards, for instance, image acquisition, pre-processing, feature extraction and data analysis. It is necessary to define and adopt internationally recognised guidelines and promote large, multi-centric prospective studies.