Abstract
Bipolar disorder is a serious psychiatric disorder characterized by periodic episodes of manic and depressive symptomatology. Because a high percentage of people suffer from severe bipolar and depressive disorders, the modelling, characterisation, classification and diagnostic analysis of these mental disorders are of vital importance in medical research. Electroencephalogram (EEG) records offer important information to enhance clinical diagnosis and are widely used in hospitals. For this reason, EEG records and patient data from the Virgen de la Luz Hospital were used in this work. This paper proposes an extreme gradient boosting (XGB) machine learning (ML) method that operates on EEG signals. Four supervised ML algorithms, namely k-nearest neighbours (KNN), decision tree (DT), Gaussian Naïve Bayes (GNB) and support vector machine (SVM), were compared with the proposed XGB method. The performance of these methods was tested using a standard 10-fold cross-validation process. The results indicate that XGB has the best prediction accuracy (94%), with high precision (\(>0.94\)) and high recall (\(>0.94\)). The KNN, SVM and DT approaches present moderate prediction accuracy (\(>87\%\)), moderate recall (\(>0.87\)) and moderate precision (\(>0.87\)). The GNB algorithm shows relatively low classification performance. Based on these results for classification performance and prediction accuracy, XGB is a solid candidate for the correct classification of patients with bipolar disorder. These findings suggest that an XGB system trained with clinical data may serve as a new tool to assist in the diagnosis of patients with bipolar disorder.
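The evaluation protocol named in the abstract (10-fold cross-validation scored by accuracy, precision and recall) can be illustrated with a minimal sketch. This is not the authors' code: it uses a plain k-nearest-neighbours classifier, one of the baseline families mentioned above, on synthetic two-dimensional data rather than EEG features. The helper names (`k_fold_indices`, `knn_predict`, `binary_metrics`, `cross_validate`, `make_synthetic`), the choice of `k=3`, and the pooling of predictions across folds before scoring are all assumptions made for illustration.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


def cross_validate(X, y, n_folds=10, k=3, seed=0):
    """Each fold is held out once; held-out predictions are pooled, then scored."""
    y_true, y_pred = [], []
    for fold in k_fold_indices(len(X), n_folds, seed):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        for i in fold:
            y_true.append(y[i])
            y_pred.append(knn_predict(train_X, train_y, X[i], k))
    return binary_metrics(y_true, y_pred)


def make_synthetic(n_per_class=50, seed=1):
    """Two Gaussian clusters standing in for control (0) and patient (1) features."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    acc, prec, rec = cross_validate(X, y)
    print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

Pooling the held-out predictions before computing the metrics is one common convention; averaging per-fold scores is another, and the abstract does not say which was used. Comparing several classifiers under this scheme amounts to reusing the same folds for each model so that the scores are directly comparable.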
Availability of data and material
The datasets generated and/or analysed during the present study are not publicly available because the patients have not given permission for these data to be openly published; they have only given permission for publication of the results. The datasets are, however, available from the corresponding author upon reasonable request.
Acknowledgements
This work was sponsored by Virgen de la Luz Hospital of Cuenca (Spain) and Institute of Technology (University of Castilla-La Mancha).
Contributions
All the authors have participated in the development of the article.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Code availability
The code generated and/or analysed during the present study is not publicly available because the patients have not given permission for it to be openly published; they have only given permission for publication of the results. The code is, however, available from the corresponding author upon reasonable request.
Cite this article
Mateo-Sotos, J., Torres, A.M., Santos, J.L. et al. A Machine Learning-Based Method to Identify Bipolar Disorder Patients. Circuits Syst Signal Process 41, 2244–2265 (2022). https://doi.org/10.1007/s00034-021-01889-1
DOI: https://doi.org/10.1007/s00034-021-01889-1