Elsevier

Journal of Neuroscience Methods

Volume 274, 1 December 2016, Pages 141-145
Short communication
Interpretable deep neural networks for single-trial EEG classification

https://doi.org/10.1016/j.jneumeth.2016.10.008

Highlights

  • First application of DNN with LRP on EEG data.

  • A novel tool for investigating neural activity.

  • Produces neurophysiologically highly plausible explanations in single trials.

  • Helps to diagnose influences that led to erroneous decisions.

  • May advance subject-independent zero training strategies in BCI.

Abstract

Background

In cognitive neuroscience the potential of deep neural networks (DNNs) for solving complex classification tasks is yet to be fully exploited. The most limiting factor is that DNNs as notorious ‘black boxes’ do not provide insight into neurophysiological phenomena underlying a decision. Layer-wise relevance propagation (LRP) has been introduced as a novel method to explain individual network decisions.

New method

We propose the application of DNNs with LRP for the first time for EEG data analysis. Through LRP the single-trial DNN decisions are transformed into heatmaps indicating each data point's relevance for the outcome of the decision.

Results

DNN achieves classification accuracies comparable to those of CSP-LDA. In subjects with low performance, subject-to-subject transfer of trained DNNs can improve the results. The single-trial LRP heatmaps reveal neurophysiologically plausible patterns, resembling CSP-derived scalp maps. Critically, while CSP patterns represent class-wise aggregated information, LRP heatmaps pinpoint neural patterns to single time points in single trials.

Comparison with existing method(s)

We compare the classification performance of DNNs to that of linear CSP-LDA on two data sets related to motor-imagery BCI.

Conclusion

We have demonstrated that DNN is a powerful non-linear tool for EEG analysis. With LRP a new quality of high-resolution assessment of neural activity can be reached. LRP is a potential remedy for the lack of interpretability of DNNs that has limited their utility in neuroscientific applications. The extreme specificity of the LRP-derived heatmaps opens up new avenues for investigating neural activity underlying complex perception or decision-related processes.

Introduction

Deep neural networks (DNNs) are powerful methods for solving complex classification tasks in fields such as computer vision (Krizhevsky et al., 2012), natural language processing (Socher et al., 2013), video analysis (Le et al., 2011) and physics (Montavon et al., 2013). Although researchers have recently started introducing this promising technology into the domain of cognitive neuroscience (Plis et al., 2014) and brain–computer interfacing (BCI) (Yuksel and Olmez, 2015; Yang et al., 2015), most of the current techniques in these fields are still based on linear methods (Parra et al., 2005; Blankertz et al., 2008). A limiting factor for the applicability of DNNs in these fields is the notion of a DNN as a black box. In the domain of cognitive neuroscience this is a particular drawback because obtaining neurophysiological insights is of utmost importance beyond the classification performance of a system.

Recently, the interpretability aspect of deep neural networks has been addressed by the layer-wise relevance propagation (LRP) method (Bach et al., 2015). LRP explains individual classification decisions of a DNN by decomposing its output in terms of input variables. It is a principled method which has a close relation to Taylor decomposition (Montavon et al., 2015) and is applicable to arbitrary DNN architectures. From a practitioner's perspective LRP adds a new dimension to the application of DNNs (e.g., in computer vision; Lapuschkin et al., 2016; Samek et al., 2015) by making the prediction transparent. Within the scope of cognitive neuroscience this means that DNN with LRP may not only provide a highly effective (non-linear) classification technique that is suitable for complex high-dimensional data, but also yield detailed single-trial accounts of the distribution of decision-relevant information, a feature that is lacking in commonly applied DNN techniques and also in other state-of-the-art methods (such as those discussed below).
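To make the idea of decomposing a network output in terms of its input variables more concrete, the following is a minimal NumPy sketch of relevance propagation through a small two-layer ReLU network. It uses the epsilon-stabilized propagation rule known from the LRP literature. The toy layer sizes, the random weights and the function names `forward` and `lrp_epsilon` are illustrative assumptions, not the model or code used in this study.

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass; returns the activations of every layer (input included)."""
    activations = [x]
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = activations[-1] @ W + b
        # ReLU on hidden layers, linear output layer (assumption of this sketch)
        a = np.maximum(z, 0.0) if i < len(weights) - 1 else z
        activations.append(a)
    return activations

def lrp_epsilon(activations, weights, biases, target, eps=1e-6):
    """Redistribute the score of output unit `target` back onto the input."""
    relevance = np.zeros_like(activations[-1])
    relevance[target] = activations[-1][target]
    # Walk the layers from output to input, pairing each layer with its input.
    for W, b, a in zip(weights[::-1], biases[::-1], activations[-2::-1]):
        z = a @ W + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilizer avoids division by zero
        s = relevance / z
        relevance = a * (W @ s)  # each input's share of the upper-layer relevance
    return relevance

# Toy example with hypothetical sizes and random weights.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 6, 4, 2
weights = [rng.normal(size=(n_in, n_hidden)), rng.normal(size=(n_hidden, n_out))]
biases = [np.zeros(n_hidden), np.zeros(n_out)]  # zero biases: relevance is conserved

x = rng.normal(size=n_in)
activations = forward(x, weights, biases)
target = int(np.argmax(activations[-1]))  # explain the winning output unit
relevance = lrp_epsilon(activations, weights, biases, target)

print("relevance per input:", relevance)
print("sum of relevance:", relevance.sum(), "output score:", activations[-1][target])
```

With zero biases, the input relevances sum (up to the small stabilizer) to the explained output score, which is the sense in which the output is "decomposed" over the inputs. With non-zero biases, part of the score is absorbed by the bias terms.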

Here we propose using DNN with LRP for the first time for EEG analysis. To this end, we train a DNN to solve a classification task related to motor-imagery BCI. On two example data sets we compare the classification performance of DNN to that of CSP-LDA, a standard technique (Blankertz et al., 2008). We then apply LRP to produce heatmaps that indicate the relevance of each data point of a spatio-temporal EEG epoch for the classifier's decision in a single trial. We present several examples of such heatmaps and demonstrate their neurophysiological plausibility. Critically, we point out that the spatio-temporal heatmaps represent a new quality of explanatory resolution that makes it possible to explain why the classifier reaches a certain decision in a single instance. Note that such information cannot be derived from CSP-LDA. Finally, we provide a range of future applications of this technique in neuroscience. We discuss why equipping the extremely powerful non-linear technology of DNN with the diagnostic power of LRP may contribute to extending the scope of DNN techniques.
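For readers unfamiliar with the CSP-LDA baseline, the following is a minimal textbook-style sketch in NumPy: common spatial pattern filters from class-wise covariances, log-variance features, and a two-class linear discriminant. It runs on synthetic data and omits steps such as band-pass filtering; all function names, sizes and the simulated signals are assumptions for illustration and do not reproduce the pipeline evaluated in this study.

```python
import numpy as np

def mean_covariance(epochs):
    """Average spatial covariance; epochs has shape (n_trials, n_channels, n_times)."""
    return np.mean(epochs @ epochs.transpose(0, 2, 1), axis=0) / epochs.shape[2]

def csp_filters(epochs_a, epochs_b, n_pairs=1):
    """Return spatial filters of shape (n_channels, 2 * n_pairs)."""
    cov_a, cov_b = mean_covariance(epochs_a), mean_covariance(epochs_b)
    # Whiten the composite covariance, then diagonalize class A in that space.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitener = evecs / np.sqrt(evals)  # scales column j by 1 / sqrt(evals[j])
    _, rotation = np.linalg.eigh(whitener.T @ cov_a @ whitener)
    filters = whitener @ rotation      # columns ordered by ascending class-A variance
    picks = np.r_[np.arange(n_pairs), np.arange(-n_pairs, 0)]  # both extremes
    return filters[:, picks]

def log_var_features(epochs, filters):
    """Log-variance of each spatially filtered signal, shape (n_trials, n_filters)."""
    projected = np.einsum("cf,nct->nft", filters, epochs)
    return np.log(projected.var(axis=2))

def fit_lda(feat_a, feat_b):
    """Two-class LDA; predicts class B when feat @ w + bias > 0."""
    mean_a, mean_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    pooled = 0.5 * (np.cov(feat_a, rowvar=False) + np.cov(feat_b, rowvar=False))
    w = np.linalg.solve(pooled, mean_b - mean_a)
    bias = -0.5 * w @ (mean_a + mean_b)
    return w, bias

# Synthetic stand-in for two motor-imagery classes (not real EEG).
rng = np.random.default_rng(0)

def simulate(n_trials, strong_channel, n_channels=4, n_times=200):
    epochs = rng.normal(size=(n_trials, n_channels, n_times))
    epochs[:, strong_channel, :] *= 3.0  # class-specific increase in variance
    return epochs

train_a, train_b = simulate(40, 0), simulate(40, 1)
test_a, test_b = simulate(20, 0), simulate(20, 1)

filters = csp_filters(train_a, train_b)
w, bias = fit_lda(log_var_features(train_a, filters), log_var_features(train_b, filters))

pred_a = log_var_features(test_a, filters) @ w + bias > 0  # True means "class B"
pred_b = log_var_features(test_b, filters) @ w + bias > 0
accuracy = (np.sum(~pred_a) + np.sum(pred_b)) / (len(test_a) + len(test_b))
print("test accuracy:", accuracy)
```

Note that the resulting filters and patterns are estimated from all training trials of a class, which is why this kind of pipeline yields class-wise aggregated maps rather than explanations of individual trials.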

Section snippets

Model details

The network applied here consists of two linear sum-pooling layers with bias-inputs, followed by an activation or normalization step each. The first linear layer accepts an input whose dimensionality is the number of time points in the epoch × the number of EEG channels (for subjects aa, …, ay 301 time points × 118 channels; for subjects od, …, obx, recorded in a different study with a different setup, 301 time points × 58 channels) vectorized to a 35,518 (od–obx: 17,458) dimensional input vector and produces a
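As a rough illustration of the input handling described above, the sketch below vectorizes one time × channel epoch, passes it through a single linear layer with bias, and reshapes a per-input relevance vector back into a time × channel heatmap. The layer's output size is not given in this excerpt, so `N_HIDDEN` is a hypothetical placeholder, and the row-major (time-major) ordering, random weights and placeholder relevance values are likewise assumptions.

```python
import numpy as np

N_TIMES = 301
N_CHANNELS = 58   # setup of subjects od to obx; the aa to ay setup has 118 channels
N_HIDDEN = 8      # hypothetical: the real output size is not stated in this excerpt

rng = np.random.default_rng(0)
epoch = rng.normal(size=(N_TIMES, N_CHANNELS))  # synthetic stand-in for one EEG epoch

# Vectorize the epoch (row-major ordering is an assumption of this sketch).
x = epoch.reshape(-1)

# One linear layer with bias input; weights are random placeholders.
W = rng.normal(scale=0.01, size=(x.size, N_HIDDEN))
b = np.zeros(N_HIDDEN)
hidden = x @ W + b

# Any per-input relevance vector (placeholder values here) maps back to a
# time x channel heatmap by undoing the vectorization.
relevance = np.abs(x)
heatmap = relevance.reshape(N_TIMES, N_CHANNELS)

print(x.size, hidden.shape, heatmap.shape)
```

Because the reshape simply inverts the vectorization, every entry of the heatmap corresponds to exactly one time point of one channel in the original epoch.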

Experimental setup and preprocessing

To gain broader experience with DNN-LRP for EEG, we demonstrated the application of DNN with LRP on two different data sets: (1) on dataset IVa from BCI competition III (cued motor imagery data with classes right hand vs. foot from 5 subjects, Blankertz et al., 2006) and (2) on a subset of 5 subjects from Brandl et al. (2015) where subjects had to perform left and right hand motor imagery while dealing with different types of distractions. Here, we only analyzed data

Discussion

We have provided the first application of DNN with LRP on EEG data. In terms of classification performance, our relatively simple DNN does not outperform the benchmark methodology of CSP-LDA. However, we provide some examples showing that training a network successively on several other subjects is advantageous. For instance, this substantially increased classification accuracy in a subject with particularly low accuracy. However, further investigations are required for reliable

Conclusion

In summary, we have provided a showcase of how LRP can add an explanatory layer to the highly effective technique of DNN in the EEG/BCI domain. Our results show that LRP provides highly detailed accounts of relevant information in high-dimensional EEG data that may be useful in analysis scenarios where single trials need to be considered individually.

Acknowledgements

This work was supported by the Brain Korea 21 Plus Program and by the Deutsche Forschungsgemeinschaft (DFG). This publication only reflects the authors’ views. Funding agencies are not liable for any use that may be made of the information contained herein.

References (24)

  • A. Krizhevsky et al.

    Imagenet classification with deep convolutional neural networks

  • S. Lapuschkin et al.

    Analyzing classifiers: Fisher vectors and deep neural networks
