Short communication
Interpretable deep neural networks for single-trial EEG classification
Introduction
Deep neural networks (DNNs) are powerful methods for solving complex classification tasks in fields such as computer vision (Krizhevsky et al., 2012), natural language processing (Socher et al., 2013), video analysis (Le et al., 2011) and physics (Montavon et al., 2013). Although researchers have recently started introducing this promising technology into the domain of cognitive neuroscience (Plis et al., 2014) and brain–computer interfacing (BCI) (Yuksel and Olmez, 2015, Yang et al., 2015), most of the current techniques in these fields are still based on linear methods (Parra et al., 2005, Blankertz et al., 2008). A limiting factor for the applicability of DNN in these fields is the notion of a DNN as a black box. In the domain of cognitive neuroscience this is a particular drawback because obtaining neurophysiological insights is of utmost importance beyond the classification performance of a system.
Recently, the interpretability aspect of deep neural networks has been addressed by the layer-wise relevance propagation (LRP) method (Bach et al., 2015). LRP explains individual classification decisions of a DNN by decomposing its output in terms of input variables. It is a principled method which has a close relation to Taylor decomposition (Montavon et al., 2015) and is applicable to arbitrary DNN architectures. From a practitioner's perspective, LRP adds a new dimension to the application of DNNs (e.g., in computer vision, Lapuschkin et al., 2016, Samek et al., 2015) by making the prediction transparent. Within the scope of cognitive neuroscience this means that a DNN with LRP may not only provide a highly effective (non-linear) classification technique that is suitable for complex high-dimensional data, but also yield detailed single-trial accounts of the distribution of decision-relevant information, a feature that is lacking in commonly applied DNN techniques and also in other state-of-the-art methods (such as those discussed below).
Here we propose using DNN with LRP for the first time for EEG analysis. To this end, we train a DNN to solve a classification task related to motor-imagery BCI. On two example data sets we compare the classification performance of the DNN to that of CSP-LDA, a standard technique (Blankertz et al., 2008). We then apply LRP to produce heatmaps that indicate the relevance of each data point of a spatio-temporal EEG epoch for the classifier's decision in a single trial. We present several examples of such heatmaps and demonstrate their neurophysiological plausibility. Critically, we point out that the spatio-temporal heatmaps represent a new quality of explanatory resolution that makes it possible to explain why the classifier reaches a certain decision in a single instance. Note that such information cannot be derived from CSP-LDA. Finally, we outline a range of future applications of this technique in neuroscience. We discuss why equipping the extremely powerful non-linear technology of DNN with the diagnostic power of LRP may contribute to extending the scope of DNN techniques.
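To make the idea of decomposing a network output in terms of input variables concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a toy two-layer ReLU network with random weights and zero biases, and it uses the epsilon-stabilized propagation rule, one of the LRP rules described by Bach et al. (2015). The function names `lrp_dense` and `explain` and all sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def lrp_dense(a, W, b, R_out, eps=1e-9):
    """Epsilon-rule LRP step: move relevance from a dense layer's outputs to its inputs."""
    z = a @ W + b                                  # layer pre-activations, shape (d_out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilizer, avoids division by zero
    s = R_out / z                                  # relevance per unit of pre-activation
    return a * (W @ s)                             # share attributed to each input

def explain(x, W1, b1, W2, b2, target):
    """Forward pass through a toy 2-layer ReLU net, then LRP for one output unit."""
    h = np.maximum(0.0, x @ W1 + b1)               # hidden activations
    out = h @ W2 + b2                              # output scores
    R_out = np.zeros_like(out)
    R_out[target] = out[target]                    # start from the score being explained
    R_h = lrp_dense(h, W2, b2, R_out)              # relevance of hidden units
    R_x = lrp_dense(x, W1, b1, R_h)                # relevance of input variables
    return out, R_x

# Toy example (assumed sizes, random weights, zero biases).
rng = np.random.default_rng(0)
x = rng.normal(size=6)
W1, b1 = rng.normal(size=(6, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)
out, R_x = explain(x, W1, b1, W2, b2, target=1)
```

In this sketch the input relevances `R_x` sum (up to the small stabilizer) to the explained output score `out[1]`, which illustrates the sense in which the output is decomposed over the input variables. With non-zero biases part of the relevance would be absorbed by the bias terms.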
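As an illustration of how a per-input relevance vector can be read as a spatio-temporal heatmap, the following sketch folds a flat relevance vector back into a time × channel array and locates its peak. It is a hedged example under stated assumptions: the epoch size of 301 time points × 58 channels is the one reported for subjects od–obx, whereas the time-major flattening order, the helper names `to_heatmap` and `peak`, and the synthetic relevance values are assumptions made here for illustration.

```python
import numpy as np

N_TIME, N_CHAN = 301, 58   # epoch size reported for subjects od-obx

def to_heatmap(relevance, n_time=N_TIME, n_chan=N_CHAN):
    """Fold a flat relevance vector back into a (time, channel) heatmap."""
    relevance = np.asarray(relevance, dtype=float)
    if relevance.size != n_time * n_chan:
        raise ValueError("relevance length does not match the epoch size")
    # Assumption: the epoch was flattened time-major (all channels of t=0 first).
    return relevance.reshape(n_time, n_chan)

def peak(heatmap):
    """Return (time index, channel index) of the largest relevance value."""
    t, c = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(t), int(c)

# Synthetic relevance vector with a single relevant data point.
flat = np.zeros(N_TIME * N_CHAN)
flat[120 * N_CHAN + 7] = 1.0
heat = to_heatmap(flat)
```

Here `peak(heat)` recovers time index 120 and channel index 7, the position of the single non-zero entry. For real data the resulting array could be plotted per channel over time or as scalp maps at selected time points.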
Model details
The network applied here consists of two linear sum-pooling layers with bias inputs, each followed by an activation or normalization step. The first linear layer accepts an input of dimensionality (number of time points in the epoch) × (number of EEG channels), i.e., 301 time points × 118 channels for subjects aa, …, ay and 301 time points × 58 channels for subjects od, …, obx, who were recorded in a different study with a different setup, vectorized to a 35,518-dimensional (od–obx: 17,458-dimensional) input vector, and produces a
Experimental setup and preprocessing
To gain broader experience with DNN-LRP for EEG, we demonstrate the application of DNN with LRP on two different data sets: (1) dataset IVa from BCI competition III (cued motor imagery data with the classes right hand vs. foot from 5 subjects; Blankertz et al., 2006) and (2) a subset of 5 subjects from Brandl et al. (2015), in which subjects had to perform left and right hand motor imagery while dealing with different types of distractions. Here, we only analyzed data
Discussion
We have provided the first application of DNN with LRP on EEG data. In terms of classification performance, our relatively simple DNN does not outperform the benchmark methodology of CSP-LDA. However, we provide some examples showing that training a network successively on several other subjects is advantageous. For instance, this substantially increased classification accuracy in a subject with particularly low accuracy. However, further investigations are required for reliable
Conclusion
In summary, we have provided a showcase of how LRP can add an explanatory layer to the highly effective technique of DNN in the EEG/BCI domain. Our results show that LRP provides highly detailed accounts of relevant information in high-dimensional EEG data that may be useful in analysis scenarios where single trials need to be considered individually.
Acknowledgements
This work was supported by the Brain Korea 21 Plus Program and by the Deutsche Forschungsgemeinschaft (DFG). This publication only reflects the authors’ views. Funding agencies are not liable for any use that may be made of the information contained herein.
References (24)
- et al. Single-trial analysis and classification of ERP components – a tutorial. NeuroImage (2011)
- et al. Subject-independent mental state classification in single trials. Neural Netw. (2009)
- et al. Recipes for the linear analysis of EEG. NeuroImage (2005)
- et al. The timing of exploratory decision-making revealed by single-trial topographic EEG analyses. NeuroImage (2012)
- et al. EEG-based classification of video quality perception using steady state visual evoked potentials (SSVEPs). J. Neural Eng. (2015)
- et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLOS ONE (2015)
- et al. The BCI competition III: validating alternative approaches to actual BCI problems. IEEE Trans. Neural Syst. Rehabil. (2006)
- et al. Optimizing spatial filters for robust EEG single-trial analysis. IEEE Signal Proc. Mag. (2008)
- et al. Bringing BCI into everyday life: motor imagery in a pseudo realistic environment
- et al. Grand average ERP-image plotting and statistics: a method for comparing variability in event-related single-trial EEG activities across subjects and conditions. J. Neurosci. Methods (2014)
- Imagenet classification with deep convolutional neural networks
- Analyzing classifiers: Fisher vectors and deep neural networks