Introduction
Sleep accounts for roughly one-third of human life and is critical to it. Sleep quality affects many aspects of a person’s physical health, mental health, and memory [1–3]. However, sleep quality analysis is not only demanding for physicians but also requires specialized equipment and expertise. Polysomnography (PSG) is a powerful tool for sleep assessment that records signals such as the electroencephalogram (EEG), electrooculogram (EOG), and electromyogram (EMG). Physicians must manually classify the collected PSG records, a subjective and tedious process. Traditional machine learning methods based on hand-crafted statistical features usually include four steps: data preprocessing, feature extraction, feature selection, and classification. Because extracting and selecting representative features demands considerable domain expertise, this pipeline is not friendly to non-specialist researchers.
Step 1: During acquisition, human bioelectrical signals are easily contaminated by other physiological signals and environmental interference, so preprocessing is needed to remove noise. Common methods include multi-scale principal component analysis (PCA) [4], wavelet transform [5], and a notch filter combined with a band-pass Butterworth filter [6].
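As a concrete illustration of Step 1, the sketch below applies a notch filter followed by a band-pass Butterworth filter using SciPy; the sampling rate (256 Hz), cutoff frequencies, and mains frequency (50 Hz) are our assumptions for illustration, not values from the cited works.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess_eeg(signal, fs=256.0, band=(0.5, 40.0), notch_hz=50.0):
    """Suppress powerline noise and keep the sleep-relevant EEG band.

    All cutoff values here are illustrative assumptions.
    """
    # Notch filter to remove mains interference (50 Hz assumed).
    b, a = iirnotch(w0=notch_hz, Q=30.0, fs=fs)
    signal = filtfilt(b, a, signal)
    # 4th-order band-pass Butterworth filter (zero-phase via filtfilt).
    b, a = butter(4, band, btype="bandpass", fs=fs)
    return filtfilt(b, a, signal)

# Example: one 30-second epoch of synthetic noisy EEG at 256 Hz.
t = np.arange(0, 30, 1 / 256.0)
raw = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
clean = preprocess_eeg(raw)
```

Because filtering is applied forward and backward (`filtfilt`), the result has no phase distortion, which matters when waveform timing (e.g., K-complexes) is later used as a feature.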
Step 2: Extract feature information from the polysomnogram, such as the maximum value, median value, entropy, and energy [7].
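A minimal sketch of Step 2, computing the kinds of features named above (maximum, median, entropy, energy) for one epoch; the histogram-based Shannon entropy definition is our assumption.

```python
import numpy as np

def epoch_features(epoch, bins=16):
    """Illustrative statistical features for one 30-s EEG epoch.

    The exact feature definitions (notably the amplitude-histogram
    entropy) are assumptions, not taken from the cited work.
    """
    hist, _ = np.histogram(epoch, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -np.sum(p * np.log2(p))   # Shannon entropy of amplitudes
    energy = np.sum(epoch ** 2)         # signal energy
    return np.array([epoch.max(), np.median(epoch), entropy, energy])

# Ten sine periods sampled at 3000 points, standing in for an epoch.
feat = epoch_features(np.sin(np.linspace(0, 20 * np.pi, 3000)))
```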
Step 3: Use methods such as the Best Subset Program (BSP) [8], minimum redundancy maximum relevance (mRMR) [9], and recursive feature elimination based on a support vector machine (SVM) [10] to select the best feature subset.
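For Step 3, SVM-based recursive feature elimination is readily available in scikit-learn; the synthetic data and the choice of five retained features below are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

# Synthetic stand-in for an epoch-by-feature matrix; the real pipeline
# would use the statistical features extracted in Step 2.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

# SVM-based recursive feature elimination: repeatedly drop the feature
# with the smallest |weight| until 5 features remain.
selector = RFE(LinearSVC(dual=False, max_iter=5000),
               n_features_to_select=5)
selector.fit(X, y)
selected = np.where(selector.support_)[0]   # indices of kept features
```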
Step 4: Apply machine learning classifiers, such as decision trees [11] or clustering [12, 13], to the selected feature combinations to perform sleep staging.
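Step 4 can then be as simple as fitting a standard classifier. This sketch uses a decision tree on synthetic five-class data standing in for the five sleep stages; all data and hyperparameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 5-class problem standing in for the five sleep stages
# (Wake, N1, N2, N3, REM); features and labels are illustrative only.
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=6, n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)   # held-out accuracy
```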
Table 1
Characteristics of each stage, adapted from the AASM manual

| Stage | Characteristic wave | EEG activity |
| --- | --- | --- |
| Wake | beta waves (>13 Hz) | >50% alpha (8–13 Hz) activity, mixed (2–7 Hz) frequency activity |
| N1 | vertex sharp waves (5–14 Hz, >75 µV) | >50% of the epoch consists of relatively low-voltage, mixed (2–7 Hz) activity; <50% alpha (8–13 Hz) activity |
| N2 | — | high-voltage, mixed (2–7 Hz) activity, K-complexes, sleep spindles |
| N3 | slow-wave activity (0.25–2 Hz) | high-voltage, low-frequency activity |
| REM | sawtooth waves (2–6 Hz) | relatively low-voltage, mixed (2–7 Hz) frequency EEG, alpha (8–13 Hz) activity |
Deep learning has gradually become the mainstream approach in recent years, because it requires no domain knowledge and supports end-to-end systems. Some researchers have found that the sleep process follows certain transition regularities [14]. Based on this, it is common practice to enrich the information of the central epoch by exploiting the surrounding epochs. For instance, Tsinalis et al. [15] combined one preceding and one following epoch with the central epoch as joint input and stacked the resulting one-dimensional features into a two-dimensional representation for learning by a convolutional neural network (CNN), thus realizing automatic sleep staging. Li et al. [16] used a many-to-one strategy, took multi-epoch (three-epoch) raw EEG signals as input, and relabeled the input; in addition, they set the softmax threshold of their CCN-SE network according to the data distribution to alleviate the class-imbalance problem. Seo et al. [17] designed a network with a modified ResNet-50 and a two-layer BiLSTM to capture intra- and inter-epoch representative features, and compared the impact of inputting one, four (three past), and ten (nine past) epochs on the classification results. Their experiments show that more input epochs yield better results, indicating that successive sleep epochs are indeed correlated. In [18] and [19], researchers built multi-task network architectures based on the observation that most adjacent epochs share the same label: for each input epoch, the categories of the surrounding epochs were additionally predicted, and a joint decision was made by assigning different weights. However, the above methods all study the correlation between stages while ignoring the similarity of features within each stage, as summarized in Table 1.
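The multi-epoch input strategy described above can be sketched generically: each central epoch is concatenated with its neighbors before being passed to a network. The edge-padding policy and window size below are our assumptions, not the exact scheme of any cited paper.

```python
import numpy as np

def make_context_windows(epochs, n_context=1):
    """Build network inputs from each epoch plus its neighbors.

    `epochs` has shape (num_epochs, samples_per_epoch). For every
    central epoch we concatenate `n_context` epochs on each side,
    padding at the recording boundaries by repeating the edge epoch.
    """
    padded = np.concatenate([
        np.repeat(epochs[:1], n_context, axis=0),
        epochs,
        np.repeat(epochs[-1:], n_context, axis=0),
    ])
    windows = [padded[i:i + 2 * n_context + 1].reshape(-1)
               for i in range(len(epochs))]
    return np.stack(windows)

# 10 epochs of 3000 samples each (30 s at 100 Hz, assumed rate).
x = np.random.randn(10, 3000)
ctx = make_context_windows(x, n_context=1)   # shape (10, 9000)
```

With `n_context=1` this reproduces the one-preceding/one-following setting; larger values correspond to the four- and ten-epoch inputs compared by Seo et al.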
For sleep staging, as with PSG analysis in general, the diversity of data modalities affects classification performance. Jia et al. [20] fed EEG, EOG, and EMG into independent CNN branch networks with multi-scale and residual connections (SleepPrintNet) before feature fusion and classification. Amelia et al. [21] used data from two PSG channels (recorded at different scalp positions); after preprocessing and data augmentation with sequentially overlapping windows, the data were fed into a CNN + LSTM network for classification. Xu et al. [22] designed a lightweight convolutional neural network to detect EEG fatigue states: they decomposed the five-channel EEG signal into multiple frequency bands, fed each band into a convolutional network, and finally used ensemble learning to weight and vote on the network outputs. However, multimodal data often require the subject to wear more sensors, which both disturbs sleep itself and hinders the adoption of daily sleep monitoring.
To address these problems, this paper proposes a novel feature relearning method for automatic sleep staging based on single-channel EEG, which reduces equipment requirements by using single-channel data and mines information within each stage. First, N1 and REM, which are highly similar and have few samples, are merged [23]. In the first part of the method, a novel stacked network performs four-class classification; in the second part, a CNN-block network distinguishes N1 from REM. The two parts are cascaded to obtain the final sleep staging result. The contributions of this paper are summarized as follows: (1) We develop a bottom–up and top–down network combined with an attention mechanism and use a cascading step with a class-imbalance strategy, which can mine feature information within each stage. (2) We achieve automatic sleep staging without any prior knowledge in the automated process, saving time and manpower. (3) We use only single-channel EEG data, which reduces equipment requirements and facilitates the promotion of daily sleep monitoring applications.
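The two-part cascade can be sketched as follows; the label encodings, toy stand-in models, and model interfaces are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def cascaded_staging(epochs, four_class_model, n1_rem_model):
    """Two-stage cascade sketched from the description in the text.

    Stage 1 assigns each epoch one of four labels (Wake, merged
    N1/REM, N2, N3); stage 2 re-classifies only the merged class
    into N1 or REM. Encodings below are assumed.
    """
    N1REM = 1            # merged class code from stage 1 (assumed)
    N1, REM = 1, 4       # final 5-class label codes (assumed)
    coarse = four_class_model(epochs)     # (num_epochs,) in {0,1,2,3}
    final = coarse.copy()
    merged = coarse == N1REM
    if merged.any():
        # Binary sub-classifier: 0 -> N1, 1 -> REM (assumed encoding).
        fine = n1_rem_model(epochs[merged])
        final[merged] = np.where(fine == 0, N1, REM)
    return final

# Toy stand-in "models", purely for demonstration.
stage1 = lambda x: np.arange(len(x)) % 4
stage2 = lambda x: (x.mean(axis=1) > 0).astype(int)
preds = cascaded_staging(np.random.randn(8, 3000), stage1, stage2)
```

Only epochs assigned to the merged class pass through the second network, so the sub-classifier sees a far more balanced N1-vs-REM problem than the full five-class task.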
The rest of this article is organized as follows. Section ‘Methods’ introduces the network structure and the framework of the proposed method. Section ‘Materials’ describes the experiments and analysis in detail. Section ‘Results and discussion’ discusses the results and visualizations. Finally, Section ‘Conclusion and future work’ presents conclusions and future work.
Conclusion and future work
In this paper, a novel feature relearning method is proposed. We design bottom–up and top–down model structures to effectively learn the features of each stage and use a cascading step to further improve classification performance. The method implements end-to-end staging, eliminating the need for expert feature engineering. Only a single-channel EEG signal is used, which reduces acquisition-equipment requirements and makes the method conducive to the practical application of sleep staging. Experimental results on the public Sleep-EDF dataset show that the proposed method achieves competitive performance. Although the hierarchical connection increases the method's complexity, the experiments demonstrate that the top–down connection achieves information complementation and comprehensively improves classification performance.
In future work, we will explore a more concise model and conduct a more detailed study on the similarity of sleep stages, especially between N1 and other stages.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.