Elsevier

Neurocomputing

Volume 136, 20 July 2014, Pages 103-123
EEG signal classification for epilepsy diagnosis via optimum path forest – A systematic assessment

https://doi.org/10.1016/j.neucom.2014.01.020

Highlights

  • Thorough assessment of the OPF classifier when coping with the task of epilepsy diagnosis through EEG signal classification.

  • Analysis of the impact of the data preprocessing methods as well as the distance measure used.

  • Comparison of the OPF classifier with SVM-RBF, ANN-MLP, and Bayesian classifiers.

  • The OPF classifier has outperformed the other models in terms of efficiency and effectiveness criteria.

Abstract

Epilepsy refers to a set of chronic neurological syndromes characterized by transient and unexpected electrical disturbances of the brain. The detailed analysis of the electroencephalogram (EEG) is one of the most influential steps for the proper diagnosis of this disorder. This work presents a systematic performance evaluation of the recently introduced optimum path forest (OPF) classifier when coping with the task of epilepsy diagnosis directly through EEG signal analysis. For this purpose, we have made extensive use of a benchmark dataset composed of five classes, whose full discrimination is very hard to achieve. Four types of wavelet functions and three well-known filter methods were considered for the tasks of feature extraction and selection, respectively. Moreover, support vector machines configured with a radial basis function kernel (SVM-RBF), multilayer perceptron neural networks (ANN-MLP), and Bayesian classifiers were used for comparison in terms of effectiveness and efficiency. Overall, the results show that the OPF classifier outperformed the other models on both types of criteria. Indeed, the OPF classifier was usually extremely fast, with average training/testing times much lower than those required by SVM-RBF and ANN-MLP. Moreover, when configured with Coiflets as feature extractors, the performance scores achieved by the OPF classifier include 89.2% as average accuracy and sensitivity/specificity values higher than 80% for all five classes.

Introduction

In the last few decades, significant progress has been made in the broad area of biomedical signal processing (BSP), aiming at extracting relevant information directly from raw physiological data [1]. In particular, the automated classification of these data has emerged as a promising strategy for assisting physicians in identifying hard-to-diagnose pathologies, such as epilepsy.

As mentioned by Chang and Lowenstein [2], the term epilepsy encompasses a number of different neurological syndromes characterized by a predisposition to recurrent unprovoked seizures. People have seizures when the electrical signals in the brain misfire. The brain's normal electrical activity is disrupted by these overactive electrical discharges, causing a temporary communication problem between nerve cells [3].

Arguably, the detailed analysis of electroencephalogram (EEG) is one of the most influential steps for the proper diagnosis of seizures and epilepsy [3]. Unfortunately, since the occurrence of an epileptic seizure cannot be predicted in advance in the majority of cases, continuous recording of EEG is quite common. However, analysis by visual inspection of long recordings of EEG is usually a time-consuming and error-prone process. Hence, the automatic detection of epilepsy directly from EEG data has been pursued by many researchers for a long time already.

Several works have investigated different artificial intelligence (AI) approaches for tackling epilepsy diagnosis via EEG signal classification. For instance, Nigam and Graupe [4] employed a multistage nonlinear filter in combination with a LAMSTAR neural network. The overall success percentage achieved by their system, considering both the false positive and false negative rates, was 97.2%. In turn, Patnaik and Manyam [5] adopted the wavelet transform for feature extraction, a genetic algorithm (GA) for choosing the training set, and an artificial neural network (ANN) trained with backpropagation for the classification of the signals. An average specificity of 99.19%, a sensitivity of 91.29% and a selectivity of 91.14% were obtained.

Alternatively, Subasi [6], [7], [8] adopted different versions of ANN and also mixture-of-experts (ME) models to discriminate between seizure and seizure-free profiles. Mixtures of ANN experts induced with wavelet coefficients have also been considered by Übeyli [9]. In this case, the total classification accuracy obtained by the ME network structures was 93.17%, and ROC curves for single multilayer perceptrons (MLP) and ME classifiers were provided. On the other hand, Güler and Übeyli [10] and Kannathal et al. [11] have both considered the application of neurofuzzy models as EEG classifiers. The main difference between these works lies in the type of feature extracted, via either wavelets or entropy measures, respectively. While the classification accuracy reported in [11] was typically above 90% for different entropy measures, that achieved in [10] was 98.68% (with Daubechies of order 2 adopted as wavelet basis).

Tzallas et al. [12] presented a methodology based on time–frequency analysis. Initially, selected segments of the EEG signals (possibly of different sizes) are analyzed using time–frequency methods and several features are extracted for each segment, representing the energy distribution in the time–frequency plane. Then, those features are used as input to a feedforward neural network, which provides the final classification. To evaluate the methodology, the authors generated four different classification problems, and the results achieved in terms of overall accuracy varied from 97.72% to 100%. Alternatively, Kocyigit et al. [13] designed an MLP classifier based on a faster variant of the independent component analysis (ICA) feature extraction technique. The resulting system achieved a sensitivity rate of 98% and a specificity rate of 90.5%.

Recently, in a series of papers [14], [15], [16], our group has systematically evaluated the potential of several kernel-based learning machines, such as support vector machines (SVM) and relevance vector machines, for the task of automatic EEG signal classification. Overall, the results achieved indicate that all kernel machines considered were competitive in terms of accuracy and generalization, and that the choice of the kernel function and its hyperparameter value, as well as the choice of the feature extractor, are critical decisions.

All the aforementioned works report experiments conducted on the dataset made publicly available by Andrzejak et al. [17], which facilitates the comparison of the results achieved. This dataset has served well for benchmarking novel approaches for EEG signal classification due to its intrinsic difficulties. In total, it has five classes (two comprising normal patients with eyes open or closed, and the remaining ones comprising ill patients with different levels of epilepsy), whose full discrimination is very hard to achieve.

In this work, we also conduct a systematic empirical study on the problem of epilepsy diagnosis via EEG signal classification. However, we focus this time on another powerful classifier referred to as optimum path forest (OPF, for short) [18], [19]. This classifier has gained increasing attention in the last few years because it has some advantages over more traditional classifiers: (i) it is free of hard-to-calibrate control parameters; (ii) it does not assume any shape/separability of the feature space; (iii) its training phase usually runs much faster; and (iv) it can make decisions based on global criteria. Moreover, the OPF classifier does not interpret the classification task as a hyperplane optimization problem, but as the computation of optimum paths from some key patterns (known as prototypes) to the remaining nodes. By this means, each prototype becomes the root of its own optimum-path tree, and each node is then classified according to its most strongly connected prototype. This process defines a discrete optimal partition (aka influence regions) of the feature space. So, we argue that, due to its high efficiency and accuracy, jointly with its parameter independence and robustness to highly non-linear datasets, the OPF classifier can be considered a very suitable alternative for automatically classifying EEG signals for epilepsy diagnosis.
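To make the idea of optimum paths rooted at prototypes more concrete, the following is a minimal Python sketch of a supervised OPF classifier. It is not the authors' implementation. It assumes the commonly described formulation in which prototypes are the endpoints of minimum-spanning-tree edges that link samples of different classes, and in which the cost of a path is its largest edge weight; neither detail is spelled out in this excerpt. The function names (`find_prototypes`, `opf_fit`, `opf_predict`) and the default Euclidean distance are illustrative choices only.

```python
import math

def euclidean(a, b):
    """Default distance; any other dissimilarity function can be passed instead."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def find_prototypes(X, y, dist):
    """Assumed prototype rule: run Prim's algorithm on the complete graph and
    mark both endpoints of every MST edge that joins two different classes."""
    n = len(X)
    in_tree = [False] * n
    key = [math.inf] * n
    parent = [-1] * n
    key[0] = 0.0
    prototypes = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=key.__getitem__)
        in_tree[u] = True
        if parent[u] != -1 and y[parent[u]] != y[u]:
            prototypes.update((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(X[u], X[v])
                if d < key[v]:
                    key[v], parent[v] = d, u
    return prototypes

def opf_fit(X, y, dist=euclidean):
    """Dijkstra-like competition: prototypes start at cost 0 and conquer the
    other samples; a path costs as much as its largest edge (assumed f_max)."""
    n = len(X)
    prototypes = find_prototypes(X, y, dist)
    cost = [0.0 if i in prototypes else math.inf for i in range(n)]
    label = list(y)
    done = [False] * n
    for _ in range(n):
        s = min((i for i in range(n) if not done[i]), key=cost.__getitem__)
        done[s] = True
        for t in range(n):
            if not done[t]:
                c = max(cost[s], dist(X[s], X[t]))
                if c < cost[t]:
                    cost[t], label[t] = c, label[s]
    return cost, label

def opf_predict(X_train, cost, label, x, dist=euclidean):
    """Assign x the label of the training sample offering the cheapest path."""
    best = min(range(len(X_train)),
               key=lambda i: max(cost[i], dist(X_train[i], x)))
    return label[best]

# Toy usage with two well-separated classes (made-up data, not EEG features).
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
cost, label = opf_fit(X, y)
print(opf_predict(X, cost, label, [0.5, 0.5]))
```

Note that the sketch has no tunable hyperparameters other than the distance function, which is passed in as an argument so that different dissimilarity measures can be swapped in.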

However, although it has shown promising results in different application domains [20], [21], [22], [23], [24], [25], only recently has the potential of the OPF classifier been investigated in the BSP context, more specifically in the problem of electrocardiogram-based arrhythmia classification [26]. As far as we are aware, no work has yet been conducted with respect to the task of epilepsy diagnosis. To assess the levels of performance delivered by the OPF classifier in this context, in terms of computational cost (efficiency) and accuracy/generalization rate (effectiveness), we have investigated the sensitivity of this classifier to the choice of the distance function used to calculate the similarity between the patterns [27], as well as to the type of features extracted from the EEG signal via the wavelet transform [9], [16], [28]. Moreover, a performance comparison with SVM classifiers configured with radial basis function kernel (SVM-RBF), MLP, and Bayesian classifiers was also carried out.

The rest of the paper is organized as follows. In Section 2, we briefly outline the main aspects behind the wavelet families used as EEG feature extractors and the classifiers used in the experiments. In this section, more emphasis is given to the characterization of the OPF classifier. Section 3 is devoted to the assessment of the performance of the OPF classifier on the task of EEG signal classification, taking into account different distance functions, feature modalities, and also the behavior displayed by the other well-known classifiers. Finally, Section 4 concludes the paper.

Section snippets

Materials and methods

This section describes the two main steps involved in our classification methodology. First, we characterize the different wavelet families adopted as EEG feature extractors as well as the filter algorithms used for feature selection. Then, we detail the formalization and properties of the OPF classifier, and also briefly discuss the main features of the other learning algorithms adopted in the comparative assessment.
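As a rough illustration of wavelet-based feature extraction, the sketch below decomposes a signal with a multilevel discrete wavelet transform and summarizes each sub-band with two statistics. It rests on several assumptions that are not taken from this excerpt: the Haar wavelet is used only because it is short enough to write out by hand (it is not claimed to be one of the four families evaluated in the paper), the number of levels is arbitrary, and the per-band statistics (mean absolute value and standard deviation) are illustrative. The names `haar_dwt` and `subband_features` are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; len(signal) must be even."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def subband_features(signal, levels=4):
    """Return [mean |c|, std(c)] for each detail band (finest first) and for
    the final approximation band. len(signal) must be divisible by 2**levels."""
    approx = list(signal)
    bands = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)
    feats = []
    for band in bands:
        n = len(band)
        mean = sum(band) / n
        mean_abs = sum(abs(c) for c in band) / n
        std = math.sqrt(sum((c - mean) ** 2 for c in band) / n)
        feats.extend([mean_abs, std])
    return feats

# Toy usage on a synthetic sinusoid (not real EEG data).
toy = [math.sin(2 * math.pi * 5 * k / 64) for k in range(64)]
print(len(subband_features(toy, levels=4)))
```

The resulting fixed-length vector is the kind of input that a feature selector and a classifier could then consume; in practice a wavelet library would be used to obtain other families such as Coiflets.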

Results and discussion

As in the related works surveyed in the introduction of this paper, we have assessed the performance of the OPF classifiers in the task of epilepsy diagnosis using the EEG data repository made publicly available by Andrzejak et al. [17]. The complete dataset consists of five sets (denoted as A–E), each containing 100 single-channel EEG segments of 23.6 s. Table 1 provides a descriptive summary of the five classes.

In some works [5], [6], [8], [14], only the sets A and E were used for

Concluding remarks

In this work, we have provided a thorough assessment of the performance of OPF classifiers when coping with the task of epilepsy diagnosis through EEG signal classification. For this purpose, four wavelet bases configured with different orders and three feature selectors were adopted for data preprocessing. When contrasted with traditional supervised learning algorithms, namely, SVM-RBF, Bayesian, and ANN-MLP, OPF classifiers have prevailed both in terms of efficiency (computational run times)

Acknowledgments

The first and last authors thank the National Council for Research and Development (CNPq) and the Ceará Foundation for the Support of Scientific and Technological Development (FUNCAP) for providing financial support through a DCR grant #35.0053/2011.1 to UNIFOR. The second, third, and fourth authors also acknowledge the sponsorship from CNPq via grants #475406/2010-9, #304603/2012-0, #308816/2012-9, and #303182/2011-3. The fourth author is also grateful to FAPESP grant #2009/16206-1.

References (49)

  • R.G. Andrzejak et al.

    The epileptic process as nonlinear deterministic dynamics in a stochastic environment: an evaluation on mesial temporal lobe epilepsy

    Epilepsy Res.

    (2001)
  • J.P. Papa et al.

    Efficient supervised Optimum-Path Forest classification for large datasets

    Pattern Recognit.

    (2012)
  • A.I. Iliev et al.

    Spoken emotion recognition through Optimum-path Forest classification using glottal features

    Comput. Speech Lang.

    (2010)
  • J.P. Papa et al.

    Computer techniques towards the automatic characterization of graphite particles in metallographic images of industrial materials

    Expert Syst. Appl.

    (2013)
  • C.R. Pereira et al.

    An Optimum-Path Forest framework for intrusion detection in computer networks

    Eng. Appl. Artif. Intell.

    (2012)
  • E.J.S. Luz et al.

    ECG arrhythmia classification based on Optimum-Path Forest

    Expert Syst. Appl.

    (2013)
  • T. Gandhi et al.

    A comparative study of wavelet families for EEG signal classification

    Neurocomputing

    (2011)
  • H. Ocak

    Optimal classification of epileptic seizures in EEG using wavelet analysis and genetic algorithm

    Signal Process.

    (2008)
  • N.F. Güler et al.

    Recurrent neural networks employing Lyapunov exponents for EEG signals classification

    Expert Syst. Appl.

    (2005)
  • E.D. Übeyli

    Combined neural network model employing wavelet coefficients for EEG signals classification

    Digit. Signal Process.

    (2009)
  • T.M. Nunes et al.

    Automatic microstructural characterization and classification using artificial intelligence techniques on ultrasound signals

    Expert Syst. Appl.

    (2013)
  • R.B. Pachori et al.

    Analysis of normal and epileptic seizure EEG signals using empirical mode decomposition

    Comput. Methods Progr. Biomed.

    (2011)
  • K. Najarian et al.

    Biomedical Signal and Image Processing

    (2012)
  • B.S. Chang et al.

    Epilepsy

    New Engl. J. Med.

    (2003)

    Thiago M. Nunes graduated in Mechatronics Technology at the Federal Institute of Education, Science and Technology of Ceará (IFCE, 2009). Currently, he is an M.Sc. student in the Department of Teleinformatics Engineering at the Federal University of Ceará (UFC). He is a collaborator of the Edmond and Lily Safra International Institute of Neuroscience of Natal (ELS-IINN), and also of the Brazilian National Institute of Science and Technology/Brain-Machine Interface (INCT/INCEMAQ). His major fields of interest are Pattern Recognition, Artificial Intelligence, and Biomedical Signal/Image Analysis and Processing.

    André L.V. Coelho earned the B.Sc. degree in Computer Engineering (1996), and the M.Sc. (1998) and Ph.D. (2004) degrees in Electrical Engineering, all from the State University of Campinas (Unicamp), SP, Brazil. He has a record of publications related to the themes of machine learning, data mining, computational intelligence, metaheuristics, and multiagent systems. He is a member of ACM and IEEE, and has served as a reviewer for a number of scientific conferences and journals. Currently, he is an adjunct professor affiliated with the Graduate Program in Applied Informatics at the University of Fortaleza (Unifor), Ceará, Brazil.

    Clodoaldo A.M. Lima received the B.Sc. degree in Electrical Engineering from the Federal University of Juiz de Fora (UFJF), Juiz de Fora, Brazil, in 1997, and the M.Sc. and Ph.D. degrees in Electrical Engineering from the University of Campinas (UNICAMP), Campinas, Brazil, in 2000 and 2005, respectively. From February 2005 to February 2006, he was a postdoctoral researcher at the same university. He has published dozens of articles and chapters on machine learning and related topics, in computer science journals and proceedings. Since 2010 he has been a Doctor Professor at the University of São Paulo, SP, Brazil. The main topics of his research are computational intelligence, machine learning, statistical learning theory and signal processing.

    João P. Papa received his B.Sc. in Information Systems from the São Paulo State University, SP, Brazil. In 2005, he received his M.Sc. in Computer Science from the Federal University of São Carlos, SP, Brazil. In 2008, he received his Ph.D. in Computer Science from the University of Campinas, SP, Brazil. During 2008–2009, he worked as a postdoctoral researcher at the same institute. He has been a Professor at the Computer Science Department, São Paulo State University, since 2009, and his research interests include machine learning, pattern recognition and image processing.

    Victor Hugo C. de Albuquerque has a Ph.D. in Mechanical Engineering with emphasis on Materials from the Federal University of Paraíba (UFPB, 2010), an M.Sc. in Teleinformatics Engineering from the Federal University of Ceará (UFC, 2007), and he graduated in Mechatronics Technology at the Federal Center of Technological Education of Ceará (CEFETCE, 2006). He is currently an Assistant I Professor of the Graduate Program in Applied Informatics at the University of Fortaleza (UNIFOR), and is a collaborator professor of the Graduate Program in Neuroengineering of the Alberto Santos Dumont Association for Research Development (AASDAP) and the Edmond and Lily Safra International Institute of Neuroscience of Natal (ELS-IINN). He is also responsible for the partnership between the Control Systems Laboratory (LSC-UNIFOR) and the Brazilian National Institute of Science and Technology – Brain-Machine Interface (INCT-INCEMAQ). He has experience in Computer Systems, mainly in the research fields of Applied Computing, Intelligent Systems, Visualization and Interaction, with specific interest in Pattern Recognition, Artificial Intelligence, Image Processing and Analysis, as well as Automation with respect to biological signal/image processing, image segmentation, biomedical circuits and human/brain–machine interaction, including Augmented and Virtual Reality Simulation Modeling for animals and humans. He is a co-author of more than 107 papers published in national and international journals and/or presented at conferences. He has also been involved in several research projects, both as a researcher and as a scientific coordinator. In addition, he has been a member of scientific and organizing committees of several international conferences, and a reviewer for more than 18 international journals.