Introduction
Epileptic seizure focus
| Cortical zone | Definition | Clinical diagnosis methods |
|---|---|---|
| Epileptogenic zone (EGZ) | Area of cortex responsible for generating epileptic seizures; its resection yields seizure freedom | Scalp EEG and iEEG |
| Irritative zone | Area of cortex that generates interictal spikes | EEG and MEG |
| Seizure onset zone (SOZ) | Area of cortex from which clinical seizures originate | SPECT, scalp EEG, and iEEG |
| Epileptogenic lesion | Structural lesion that is related to the epilepsy | High-resolution MRI |
| Ictal symptomatogenic zone | Area of cortex that generates the seizure symptoms or signs | Ictal video recording |
| Functional deficit zone | Area of cortex that is not functioning normally in the interictal period | Neurological examination, neuropsychological testing, interictal PET and SPECT, non-epileptiform EEG, and MEG |
| Dataset name | No. of subjects | Electrode type | EEG type | Sampling frequency (Hz) | Goal of the dataset |
|---|---|---|---|---|---|
| Bonn (Andrzejak et al. 2001) | 5 | Single channel | Scalp EEG & iEEG | 173.61 | Epileptic and non-epileptic patient detection |
| Flint Hill (Osorio et al. 2001) | 10 | Multi-channel | iEEG | 240 | Seizure detection |
| Freiburg (Winterhalder et al. 2003) | 21 | Multi-channel | iEEG | 256 | Seizure detection |
| CHB-MIT\(^{1}\) (Shoeb and Guttag 2010) | 23 | Multi-channel | Scalp EEG | 256 | Seizure detection |
| Epilepsiae (Ihle et al. 2012) | 275 | Multi-channel | Scalp EEG & iEEG | 250–2500 | Seizure detection |
| TUSZ\(^{2}\) (Obeid and Picone 2016) | 315 | Multi-channel | Scalp EEG | 250 | Seizure detection |
| Bern-Barcelona (Andrzejak et al. 2012) | 5 | Binary channels | iEEG | 512 | Epileptic focus detection |
Public datasets

The Temple University Hospital (TUH) EEG Corpus (Harati et al. 2014) is large and contains various sub-datasets, including sets for abnormality detection, seizure detection, and artifact classification.

The University of Bonn EEG Dataset (Andrzejak et al. 2001) contains several different classes of data that are recorded from healthy volunteers and patients.

Other common datasets include IEEG.org and the European Epilepsy Dataset.

It includes no information regarding electrode locations, which is essential for focus identification.

Signals are provided as independent segments without patient labels.

The highest frequency is limited to 150 Hz, even though recent neurological findings indicate that high-frequency components (>100 Hz) are crucial for identifying the epileptic focus.
Biomarker-based approach
High-frequency oscillations
Automated methods for HFO detection
Phase-amplitude coupling (PAC)
| Authors | Dataset | EEG type | Significant PAC range |
|---|---|---|---|
| Guirgis et al. (2015) | 7 patients with ETLE | iEEG | \(MI_{30-450\,Hz\ \&\ 0.5-4\,Hz}\) |
| Amiri et al. (2016) | 25 consecutive epileptic patients (Montreal Neurological Institute and Hospital) | Scalp EEG | \(MI_{30-260\,Hz\ \&\ 0.3-13\,Hz}\) |
| Weiss et al. (2016) | 12 patients with MTLE (UCLA Seizure Disorder Center) | iEEG | PAC between ripple amplitude and epileptiform spike phase |
| Elahian et al. (2017) | 10 patients with epilepsy (Le Bonheur Children’s Hospital) | ECoG | \(MI_{80-150\,Hz\ \&\ 4-30\,Hz}\) |
| Motoi et al. (2018) | 123 patients with drug-resistant focal epilepsy (Children’s Hospital of Michigan and Harper University Hospital in Detroit) | ECoG | \(MI_{150-300\,Hz\ \&\ 3-4\,Hz}\) |
| Varatharajah et al. (2018) | 82 patients with focal epilepsy (Mayo Clinic, Rochester, MN) | iEEG | \(MI_{65-115\,Hz\ \&\ 0.1-30\,Hz}\) |
| Amiri et al. (2019) | 18 patients with mTLE (Montreal Neurological Institute and Hospital) | iEEG | \(MI_{HFOs\ \&\ 4-8\,Hz}\) |
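For readers implementing PAC measures: the modulation index (MI) used in several of the studies above is straightforward to compute once the phase of the slow rhythm and the amplitude envelope of the fast rhythm are available (in practice obtained by band-pass filtering and the Hilbert transform). A minimal NumPy sketch of a Tort-style MI, using synthetic phase and amplitude series rather than real iEEG:

```python
import numpy as np

def modulation_index(phase, amplitude, n_bins=18):
    """Tort-style modulation index: KL divergence between the
    phase-binned mean-amplitude distribution and a uniform
    distribution, normalized to [0, 1]."""
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    mean_amp = np.zeros(n_bins)
    for i in range(n_bins):
        mask = (phase >= edges[i]) & (phase < edges[i + 1])
        mean_amp[i] = amplitude[mask].mean() if mask.any() else 0.0
    p = mean_amp / mean_amp.sum()        # amplitude distribution over phase bins
    p = p[p > 0]
    kl = np.sum(p * np.log(p * n_bins))  # KL(p || uniform)
    return kl / np.log(n_bins)

# Synthetic example: an amplitude envelope locked to a 6 Hz phase
# yields a clearly higher MI than an unmodulated envelope.
rng = np.random.default_rng(0)
t = np.arange(20000) / 1000.0                       # 20 s at 1 kHz
phase = np.angle(np.exp(1j * 2 * np.pi * 6 * t))    # 6 Hz phase series
coupled = 1.0 + 0.8 * np.cos(phase)                 # amplitude locked to phase
uncoupled = 1.0 + 0.01 * rng.standard_normal(t.size)
```

The filter bands and bin count here are illustrative, not taken from any surveyed study.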
Interictal epileptiform discharges (IEDs)
Information theoretic methods: sample entropy (SE), permutation entropy (PE), delay permutation entropy (DPE), approximate entropy (APE), fuzzy entropy (FzE), Rényi’s entropy (REN), Shannon entropy (SE), Tsallis entropy (Ts), phase entropy (S1 and S2), wavelet entropy (WE), k-nearest neighbors entropy (kNNE), centered correntropy (CCE), Stein’s unbiased risk estimate entropy (SUREE), log-energy entropy (LEE), multivariate entropy (MVE).

Statistical methods: mean, variance (Var), standard deviation (SD), coefficient of variation, mean absolute value, modified mean absolute value (MMAV), MMAV2, fluctuation index, log detector, median frequency (MDF), mean frequency (MNF), Katz fractal dimension (KFD), fractal dimension (FD), skewness, kurtosis, quartile measures (Q1, Q3, interquartile range), largest Lyapunov exponent (LLE), root mean square (RMS), zero crossing (ZC), band power (BP), Hjorth parameters (activity, mobility, and complexity), Teager energy, 1st and 2nd derivative statistics (mean, SD, Var), recurrence quantification analysis (RQA) measures: mean diagonal line length (MDLL), laminarity (LAM), trapping time (TT), longest vertical line (LVL), longest diagonal line (LDL), recurrence times (RT), Kolmogorov complexity (KC), Lempel–Ziv complexity (LC).
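Several of the statistical features listed above take only a few lines of code. As an illustration (not code from any surveyed study), a NumPy sketch of the Hjorth parameters, checked on a synthetic sinusoid:

```python
import numpy as np

def hjorth(x):
    """Hjorth parameters of a 1-D signal: activity (variance),
    mobility (RMS frequency proxy), and complexity (waveform
    similarity to a pure sine; ~1 for a sinusoid)."""
    dx = np.diff(x)                       # first derivative (discrete)
    ddx = np.diff(dx)                     # second derivative (discrete)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

# A pure sinusoid: variance 0.5, complexity close to 1 (its
# derivative is another sinusoid of the same frequency).
t = np.linspace(0, 1, 1000, endpoint=False)
act, mob, comp = hjorth(np.sin(2 * np.pi * 5 * t))
```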
Statistical feature extraction
| Authors | Features | Classifier | Evaluation |
|---|---|---|---|
| Zhu et al. (2013) | Information theoretic features | SVM | ACC: 84% |
| Sharma et al. (2014) | EMD + information theoretic features | LS-SVM | ACC: 85% |
| Sharma et al. (2015a) | EMD + information theoretic features | LS-SVM | ACC: 87%; Sen: 90%; Spe: 84% |
| Sharma et al. (2015b) | DWT + information theoretic features | PNN, kNN, FSC, LS-SVM | ACC: 84%; Sen: 84%; Spe: 84% |
| Deivasigamani et al. (2016) | DTCWT + statistical methods | ANFIS | ACC: 99%; Sen: 98%; Spe: 100% |
| Das and Bhuiyan (2016) | EMD-DWT + information theoretic features | kNN | ACC: 89.40%; Sen: 90.70%; Spe: 88.10% |
| Sharma et al. (2017) | Wavelet FB + information theoretic features | SVM | ACC: 94.25%; Sen: 91.95%; Spe: 96.56% |
| Sharma et al. (2017) | TQWT + information theoretic, statistical features | SVM | ACC: 95% |
| Gupta et al. (2017) | FAWT + information theoretic features | kNN, LS-SVM | ACC: 94.41%; Sen: 93.25%; Spe: 95.57% |
| Bhattacharyya et al. (2017) | TQWT + information theoretic features | LS-SVM | ACC: 84.67%; Sen: 83.86%; Spe: 85.46% |
| Sriraam and Raghu (2017) | Information theoretic, statistical features | SVM | ACC: 92.15%; Sen: 94.56%; Spe: 89.74% |
| Arunkumar et al. (2017) | Information theoretic features | NB, SVM, kNN, NNge, BFDT | ACC: 98%; Sen: 100%; Spe: 96% |
| Itakura and Tanaka (2017) | BEMD + information theoretic features | RBF SVM | ACC: 86.89% |
| Chen et al. (2017) | DWT + statistical features | RBF SVM | ACC: 88% |
| Bhattacharyya et al. (2018) | EWT + statistical features | LS-SVM | ACC: 90%; Sen: 98%; Spe: 92% |
| Acharya et al. (2019) | Statistical features | LS-SVM | ACC: 87.93%; Sen: 89.97%; Spe: 85.89% |
| Dalal et al. (2019) | FAWT + statistical features | RELS-TSVM | ACC: 90.2% |
| Subasi et al. (2019) | EMD + DWT + WPD features | RF | ACC: 99.92% |
| Gupta and Pachori (2020) | WT + information theoretic features | LS-SVM | ACC: 95.85%; Sen: 95.47%; Spe: 96.24% |
| Sharma et al. (2020) | Statistical features | SVM | ACC: 99% |
Neural networks: endtoend approach
Structure of neural network

The convolutional layer consists of a set of learnable filters (or kernels), each with a small receptive field. A dot product (inner product) is computed between the filter weights and the corresponding region of the input data. The output of the convolutional layer is called the feature map; its depth is controlled by the number of filters. The stride controls how far the filter moves across the input data at each step.
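A minimal NumPy sketch of this operation for a single-channel 2-D input and a bank of filters (shapes and filter values are illustrative only):

```python
import numpy as np

def conv2d(x, kernels, stride=1):
    """'Valid' cross-correlation of a 2-D input with a filter bank.
    Output depth equals the number of filters (the feature-map depth);
    the stride controls how far each filter moves per step."""
    n_f, kh, kw = kernels.shape
    h = (x.shape[0] - kh) // stride + 1
    w = (x.shape[1] - kw) // stride + 1
    out = np.zeros((n_f, h, w))
    for f in range(n_f):
        for i in range(h):
            for j in range(w):
                patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
                out[f, i, j] = np.sum(patch * kernels[f])  # dot product
    return out

x = np.arange(16.0).reshape(4, 4)                 # toy 4x4 input
kernels = np.stack([np.ones((2, 2)), np.eye(2)])  # two 2x2 filters
fmap = conv2d(x, kernels, stride=2)               # shape (2, 2, 2)
```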

The recurrent layer operates on the sequential nature of the data; each output builds upon the one before it. An RNN with a tanh activation function can be defined as$$\begin{aligned} h_{t} = \tanh (W_{x}x_{t} + W_{h}h_{t-1} + b), \end{aligned}$$(4)where \(x_{t}\) is the input data at time t, and \(h_{t}\) and \(h_{t-1}\) are the hidden states at times t and \(t-1\), respectively. \(W_{x}\) and \(W_{h}\) are the learnable parameter matrices for the input data and the hidden state, respectively, and b is the bias vector.
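Eq. (4) translates directly into code; a NumPy sketch unrolled over a few time steps (all shapes and random weights are illustrative):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

rng = np.random.default_rng(0)
W_x = 0.1 * rng.standard_normal((3, 2))  # input -> hidden weights
W_h = 0.1 * rng.standard_normal((3, 3))  # hidden -> hidden weights
b = np.zeros(3)                          # bias vector

h = np.zeros(3)                          # initial hidden state
for x_t in rng.standard_normal((5, 2)):  # unroll over 5 time steps
    h = rnn_step(x_t, h, W_x, W_h, b)    # each step builds on the last
```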

The pooling layer performs a down-sampling operation on the input data, which lowers the computational complexity and helps prevent overfitting. Two commonly used pooling operations are max pooling and average pooling, both of which partition the input data into a set of sub-regions. For each sub-region, the output is the maximum or the average value, respectively.
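A NumPy sketch of both variants over non-overlapping sub-regions (the 2 × 2 window size and the input values are illustrative):

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Partition x into non-overlapping size x size sub-regions and
    keep the maximum (max pooling) or mean (average pooling) of each."""
    h, w = x.shape[0] // size, x.shape[1] // size
    blocks = x[:h*size, :w*size].reshape(h, size, w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [0., 0., 1., 1.],
              [0., 4., 1., 1.]])
```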

The fully connected layer computes the class scores in the last layer. Its input is a one-dimensional feature vector obtained from the immediately preceding layer. In a fully connected layer, each neuron is connected to all activations in the previous volume, as in a traditional multilayer perceptron.
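A sketch of flattening a feature map and computing class scores; the soft-max is included only because it is the usual final output stage (as in the VGG configuration below), and all shapes and weights are illustrative:

```python
import numpy as np

def fully_connected(x, W, b):
    """Dense layer: every output neuron sees the whole flattened input."""
    return W @ x + b

def softmax(z):
    """Convert raw class scores into a probability distribution."""
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

fmap = np.arange(8.0).reshape(2, 2, 2)   # toy feature map
x = fmap.ravel()                         # flattened 1-D feature vector
rng = np.random.default_rng(1)
W = 0.1 * rng.standard_normal((3, 8))    # 3 class-score neurons
scores = fully_connected(x, W, np.zeros(3))
probs = softmax(scores)
```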

The activation function is a nonlinear mathematical operation between the current neuron and its output to the next layer. Some commonly used activation functions are defined as follows: \(\text{Sigmoid}(x) = \frac{1}{1 + e^{-x}}\), \(\tanh (x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}\), and \(\text{ReLU}(x) = \max (0, x)\), where x is the input variable in each case.
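These definitions translate directly to code; a small sketch:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^{-x}); squashes input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """ReLU(x) = max(0, x); passes positives, zeroes out negatives."""
    return np.maximum(0.0, x)

# tanh is available directly as np.tanh and squashes into (-1, 1).
```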

The batch normalization layer re-centers and re-scales the input data, which stabilizes neural networks and allows faster convergence. The calculation has two steps. The first computes the mean \(E(\cdot )\) and variance \(\text{ Var }(\cdot )\) of a batch of data. In the second step, each sample is centered by subtracting the mean and dividing by the standard deviation: \(y = \frac{x - E(x)}{\sqrt{\text{ Var }(x)}}\), where x is the input variable and y is the normalized result.
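The two steps can be sketched as follows (the small eps guards against division by zero, as standard batch-normalization implementations do; the learnable scale/shift parameters of a full BN layer are omitted here):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize a batch to zero mean and unit variance per feature:
    step 1 computes the batch mean and variance, step 2 centers and
    scales each sample."""
    mean = x.mean(axis=0)                 # step 1: batch statistics
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)  # step 2: normalize

rng = np.random.default_rng(0)
batch = 5.0 + 2.0 * rng.standard_normal((64, 3))  # mean ~5, std ~2
y = batch_norm(batch)                             # mean ~0, std ~1
```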
VGG model structures (configurations A–E; depth increases from A to E). All configurations take a 224 × 224 RGB image as input, each block of convolutional layers is followed by a maxpool layer, and all configurations end with FC-4096, FC-4096, FC-1000, and a soft-max layer.

| A | A-LRN | B | C | D | E |
|---|---|---|---|---|---|
| conv3-64 | conv3-64 | conv3-64 | conv3-64 | conv3-64 | conv3-64 |
|  | LRN | conv3-64 | conv3-64 | conv3-64 | conv3-64 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-128 | conv3-128 | conv3-128 | conv3-128 | conv3-128 | conv3-128 |
|  |  | conv3-128 | conv3-128 | conv3-128 | conv3-128 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 |
| conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 |
|  |  |  | conv1-256 | conv3-256 | conv3-256 |
|  |  |  |  |  | conv3-256 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
|  |  |  | conv1-512 | conv3-512 | conv3-512 |
|  |  |  |  |  | conv3-512 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
|  |  |  | conv1-512 | conv3-512 | conv3-512 |
|  |  |  |  |  | conv3-512 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
Methods based on neural networks
| Authors | Feature | Classifier | Dataset | Accuracy |
|---|---|---|---|---|
| Sui et al. (2019) | STFT | CNN | Bern-Barcelona | 91.8% |
| Subathra et al. (2020) | FWHT | ANN | Bern-Barcelona | 92.8% |
| Siddharth et al. (2019) | SMSSA | SAE-RBFN | Bern-Barcelona | 99.11% |
| Zhao et al. (2018) | Entropy | CNN | Bern-Barcelona | 83.0% |
| San-Segundo et al. (2019) | FT, WT & EMD | CNN | Bern-Barcelona | 98.9% |
| Gagliano et al. (2019) | Bispectral | LSTM | iEEG.org | 86.29% |
| Zhao et al. (2021) | Entropy & STFT | FCNN | Bern-Barcelona | 93.44% |
| Daoud and Bayoumi (2019) |  | DCAE & MLP | Bern-Barcelona | 93.21% |
| Li et al. (2019) |  | 1D-CNN | Bern-Barcelona | 85.14% |
| Lu and Triesch (2019) |  | CNN | Bern-Barcelona | 91.8% |
| Fraiwan and Alkhodari (2020) |  | Bidirectional LSTM | Bern-Barcelona | 99.60% |
|  | Predicted positive | Predicted negative |
|---|---|---|
| Actual positive | TP: true positive | FN: false negative |
| Actual negative | FP: false positive | TN: true negative |
Evaluation criteria
Segment-wise criteria

Accuracy (ACC):$$\begin{aligned} Accuracy=\frac{TP+TN}{TP+FP+TN+FN}\times 100\%, \end{aligned}$$(5)

Sensitivity (SEN) or recall:$$\begin{aligned} SEN=\frac{TP}{TP+FN}\times 100\%, \end{aligned}$$(6)

Specificity (SPE):$$\begin{aligned} SPE=\frac{TN}{TN+FP}\times 100\%, \end{aligned}$$(7)

Precision or positive predictive value (PPV):$$\begin{aligned} Precision=\frac{TP}{TP+FP}\times 100\%, \end{aligned}$$(8)

Fallout or false positive rate (FPR):$$\begin{aligned} FPR=\frac{FP}{TN+FP}\times 100\%, \end{aligned}$$(9)

and the F\(_{1}\) score, which is the harmonic mean of precision and sensitivity, defined as:$$\begin{aligned} F_{1}\,score=\frac{2}{\frac{1}{Recall}+\frac{1}{Precision}}, \end{aligned}$$(10)
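These segment-wise criteria follow directly from the confusion-matrix counts; a short Python sketch with purely illustrative counts (not results from any surveyed study; the F\(_{1}\) score is computed on fractions rather than percentages):

```python
def segment_metrics(tp, fp, tn, fn):
    """Segment-wise criteria computed from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + tn + fn) * 100   # accuracy (%)
    sen = tp / (tp + fn) * 100                    # sensitivity / recall (%)
    spe = tn / (tn + fp) * 100                    # specificity (%)
    ppv = tp / (tp + fp) * 100                    # precision / PPV (%)
    fpr = fp / (tn + fp) * 100                    # fall-out / FPR (%)
    f1 = 2 / (1 / (sen / 100) + 1 / (ppv / 100)) # harmonic mean, in [0, 1]
    return acc, sen, spe, ppv, fpr, f1

acc, sen, spe, ppv, fpr, f1 = segment_metrics(tp=80, fp=10, tn=90, fn=20)
```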
Electrode-wise evaluation criteria
Discussion and open problems

In our survey, recent studies that utilize engineering solutions to identify SOZ channels have shown promising results. Future studies should consider larger patient cohorts spanning different ages and pathological types.

Most of the studies in this survey focused on developing patient-dependent methods to improve computer-aided systems rather than patient-independent (PID) systems. For real-world applications, the patient-independent design is preferable, because in a patient-dependent system epileptologists must first label focal and non-focal electrodes from some of the patient's EEG data before the system can hypothesize the possible SOZ channels. However, designing a patient-independent system for identifying SOZ channels is challenging due to varying electrode configurations and the subject-specific nature of EEG signals. The most promising directions, which allow adaptation to different data distributions, may be transfer learning and domain adaptation (Pan and Yang 2009; Lotte and Guan 2010; Azab et al. 2018).

For designing a supervised computer-aided system (either patient-dependent or patient-independent), the major limitation is the need to use the SOZ as prior ground truth for the classifier training stage. An unsupervised computer-aided system for SOZ identification would therefore be of great value, as it could support medical decisions without prior ground-truth information.

Data recording under clinical protocols depends on the patient's condition; in particular, it is challenging to collect enough data for machine learning. Data augmentation is another active topic in AI design, as it can improve system usability and reduce the required training set. Some attempts at data augmentation for focus detection have been reported (Akter et al. 2020a, b).

The selection of influential parameters is another factor critical to the design of a computer-aided system. More intelligent signal-processing and feature-extraction methods are required to further improve focus-identification performance.

Statistical and information-theoretic features of high-frequency components are promising in this application; however, their interpretation in terms of clinical neurophysiology is still in question.