1 Introduction
2 Background
Study | Input | Dataset | Architecture | Accuracy Reached
---|---|---|---|---
Niyongabo, Zhang et al. (2022) | Time–frequency (CWT) | CWRU Bearing | CNN | 98.57% |
Han, Tian, et al. (2020) | Time signal, Time–frequency | CWRU Bearing | LeNet-5 | 93.7% |
Kumar, Anil, et al. (2020) | Time–frequency | Own Dataset (Collected) | CNN | 96.8% |
Zhang, Ying, et al. (2020a) | Time signal, Time–frequency | Own Dataset (Collected) | CNN | 97.81% |
Dhar, Priyadarshiny, et al. (2021) | Time–frequency | PCG Bearing database | AlexNet | 98.00% |
Cao, Guannan, et al. (2020) | Time–frequency | CWRU Bearing | GoogLeNet | 99.5% |
Neupane, Dhiraj, et al. (2020) | Time–frequency | CWRU Bearing | ResNet-50 | 99.68%–99.95% |
S. Zhang, F. Ye, et al. (2019b) | Time signal | CWRU Bearing | CNN | 98.06% |
Duan, Jian, et al. (2021) | Time–frequency | CWRU Bearing | CNN | 96.05% |
J. Zhang, Y. Sun, L. Guo, et al. (2020c) | Time–frequency | Own Dataset (Collected) | DFCNN | 99.80% |
Wen, Long, et al. (2019) | Time signal to RGB | CWRU Bearing | VGG-19 | 99.175% |
Grover, Chhaya, and Neelam Turk (2022) | Bispectrum images | CWRU Bearing | ResNet-50 | 99.85% |
Wei, Hao, et al. (2021) | Time–frequency (CWT) | CWRU Bearing | Residual Extreme Learning Machine (ResELM) | 99.90% |
Liu, Chenyu, et al. (2021) | Time signal | CWRU Bearing | SE-ResNet-26 | 99.30% |
Lee, Chun-Yao, and Truong-An Le (2021) | Frequency (Persistence) | CWRU Bearing | ResNet structure (RCNN) | 99.625% |
Li, Mingyong, et al. (2019b) | Time signal | CWRU Bearing | Sequential CNN | 96%–98% |
Sun, Guodong, et al. (2021) | Time–frequency (MSSST) | CWRU Bearing | Faster Dictionary Learning | 99.08% |
Sun, Guodong, et al. (2020) | Time–frequency (STMSST) | CWRU Bearing | LeNet-5 | 99.83% |
Yuan, Laohu, et al. (2020) | Time–frequency (CWT) | CWRU Bearing | CNN-SVM | 98.75% |
Minervini, Marcello, et al. (2021) | Time–frequency (spectrogram) | CWRU Bearing | CommandNet | 91.3% |
Deveci, Çeltikoglu, et al. (2021) | Time–frequency (spectrogram) | CWRU Bearing | ResNet-50 | 99.27% |
Zhang, Huang, et al. (2022) | Time–frequency (spectrogram) | Machinery Failure Prevention Technology (MFPT) Dataset | CNN | 99.63% |
Yu, Y., et al. (2022) | Time–frequency (CWT, FFT) | MFPT Dataset | DASENet | 98.26% |
Yu, Y., et al. (2022) | Time–frequency (CWT, FFT) | MFPT Dataset | MIMTNer | 97.80% |
Yu, Y., et al. (2022) | Time–frequency (CWT, FFT) | MFPT Dataset | ResNet-18 | 96.68% |
Yu, Y., et al. (2022) | Time–frequency (CWT, FFT) | MFPT Dataset | TCNN (FDS) | 93.86% |
Yu, Y., et al. (2022) | Time–frequency (CWT, FFT) | MFPT Dataset | FTNN | 93.61% |
Yu, Y., et al. (2022) | Time–frequency (CWT, FFT) | MFPT Dataset | SVM (FDS) | 90.44% |
Wang, Zongyao, et al. (2022) | Time–frequency (CWT) | MFPT Dataset | GoogLeNet | 98.412% |
Wang, Zongyao, et al. (2022) | Time–frequency (CWT) | MFPT Dataset | ResNet-50 | 92.318% |
Wang, Zongyao, et al. (2022) | Time–frequency (CWT) | MFPT Dataset | SqueezeNet | 97.51% |
Wang, Zongyao, et al. (2022) | Time–frequency (CWT) | MFPT Dataset | IGoogLeNet | 99.4% |
3 Overview of CNN Architectures
3.1 CNN Architectures
3.1.1 GoogLeNet
3.1.2 ResNet-50
Method | Top-5 Error (%, Test Results)
---|---
VGG (Simonyan & Zisserman, 2014) (ILSVRC’14) | 7.32
GoogLeNet (Szegedy et al., 2015) (ILSVRC’14) | 6.66
VGG (Simonyan & Zisserman, 2014) (v5) | 6.8
PReLU-Net (He et al., 2015) | 4.94
BN-Inception (Ioffe & Szegedy, 2015) | 4.82
ResNet (He et al., 2016) (ILSVRC’15) | 3.57
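The gap between ResNet and the plain architectures above comes from its residual (skip) connections: each block learns a residual F(x) and adds the input back. A minimal NumPy sketch of the idea (the shapes and weights here are illustrative toy values, not the actual ResNet-50 configuration):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x): the block learns a residual F(x)
    instead of a full mapping, so the identity is easy to represent."""
    h = relu(x @ w1)       # first transform of the residual branch
    f = h @ w2             # second transform of the residual branch
    return relu(f + x)     # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
# With all-zero weights the residual branch vanishes and the block
# reduces to ReLU(x): gradients can always flow through the skip path.
w_zero = np.zeros((8, 8))
y = residual_block(x, w_zero, w_zero)
print(np.allclose(y, relu(x)))  # True
```

This ease of representing the identity is why very deep residual networks train well where equally deep plain networks degrade.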
3.1.3 SqueezeNet
- Layers breakdown (Gaikwad & El-Sharkawy, 2018)
  - Layer 1: regular convolution layer
  - Layers 2–9: fire modules (squeeze + expand layers)
  - Layer 10: regular convolution layer
  - Layer 11: softmax layer
- Architecture specifications (Gaikwad & El-Sharkawy, 2018)
  - gradually increasing number of filters per fire module
  - max-pooling with a stride of 2 after layers 1, 4, and 8
  - average-pooling after layer 10
  - delayed downsampling with pooling layers
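The fire module listed above (a 1×1 "squeeze" convolution followed by parallel 1×1 and 3×3 "expand" convolutions whose outputs are concatenated) can be sketched as follows; the channel counts below are illustrative toy values, not the exact SqueezeNet configuration:

```python
import numpy as np

def conv1x1(x, w):
    """Pointwise convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def conv3x3(x, w):
    """3x3 convolution with zero padding; w is (C_out, C_in, 3, 3)."""
    c, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(h):
        for j in range(wd):
            out[:, i, j] = np.einsum('ocij,cij->o', w, xp[:, i:i+3, j:j+3])
    return out

def fire_module(x, w_squeeze, w_e1, w_e3):
    """Squeeze with 1x1 convs, then concatenate 1x1 and 3x3 expands."""
    s = np.maximum(conv1x1(x, w_squeeze), 0.0)   # squeeze + ReLU
    e1 = np.maximum(conv1x1(s, w_e1), 0.0)       # 1x1 expand branch
    e3 = np.maximum(conv3x3(s, w_e3), 0.0)       # 3x3 expand branch
    return np.concatenate([e1, e3], axis=0)      # channel concatenation

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8, 8))              # 16 input channels
out = fire_module(x,
                  rng.standard_normal((4, 16)),       # squeeze to 4
                  rng.standard_normal((8, 4)),        # expand 1x1 -> 8
                  rng.standard_normal((8, 4, 3, 3)))  # expand 3x3 -> 8
print(out.shape)  # (16, 8, 8)
```

Squeezing to few channels before the 3×3 convolutions is what keeps SqueezeNet's parameter count small.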
3.1.4 Inception-ResNet-v2
3.2 Time–Frequency Methods
[Figure: sample images for the ten CWRU bearing conditions: Normal (Load 2); Ball Fault 007 (Load 0), 014 (Load 1), 021 (Load 3); Inner Race Fault 007 (Load 0), 014 (Load 1), 021 (Load 3); Outer Race Fault 007 (Load 0), 014 (Load 1), 021 (Load 3)]
3.2.1 Spectrogram
3.2.2 Continuous Wavelet Transform (CWT)
3.2.3 Wavelet-based Synchrosqueezing Transform (WSST)
3.2.4 Fourier-based Synchrosqueezing Transform (FSST)
3.2.5 Wigner-Ville Distribution (WVD)
3.2.6 Constant-Q Nonstationary Gabor Transform (CQT)
3.2.7 Instantaneous Frequency
3.2.8 Hilbert Huang Transform (HHT)
- The number of extrema equals the number of zero crossings, or differs from it by at most one.
- The mean of the upper and lower envelopes approaches zero.
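The first IMF condition above can be checked numerically on a sampled signal. This is an illustrative sketch only; the second (envelope-mean) condition would additionally require envelope estimation, e.g. spline interpolation through the extrema, and is omitted here:

```python
import numpy as np

def count_zero_crossings(x):
    """Number of sign changes in the signal (exact zeros ignored)."""
    s = np.sign(x)
    s = s[s != 0]
    return int(np.sum(s[:-1] != s[1:]))

def count_extrema(x):
    """Number of interior local maxima and minima."""
    d = np.diff(x)
    return int(np.sum(d[:-1] * d[1:] < 0))

def satisfies_imf_condition_1(x):
    """Extrema and zero-crossing counts equal or differing by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1

t = np.linspace(0, 1, 1000, endpoint=False)
mono = np.sin(2 * np.pi * 5 * t)        # a single-tone IMF candidate
print(satisfies_imf_condition_1(mono))  # True
```

A signal failing this check would be decomposed further by the sifting process before being accepted as an IMF.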
3.2.9 Scattergram
3.3 Frequency Methods
3.3.1 Power
3.3.2 Persistence
3.3.3 Spectral Kurtosis
3.3.4 Kurtogram
3.3.5 Spurious Free Dynamic Range (SFDR)
4 Datasets and Methodology Used
4.1 Case Western Reserve University Bearing Data
4.2 Machinery Failure Prevention Technology Bearing Dataset
- 3 baseline (normal) condition signals at a constant load of 270 lbs,
- 3 outer race fault condition signals at a constant load of 270 lbs,
- 7 outer race fault condition signals with changing loads (25, 50, 100, 150, 200, 250, and 300 lbs),
- 7 inner race fault condition signals with changing loads (0, 50, 100, 150, 200, 250, and 300 lbs).
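These 20 signals can be tabulated for a quick consistency check; the condition labels below are illustrative names, not the dataset's actual file names:

```python
# Hypothetical labels; the MFPT archive's real file names differ.
mfpt_signals = (
    [("baseline", 270)] * 3
    + [("outer_race", 270)] * 3
    + [("outer_race", load) for load in (25, 50, 100, 150, 200, 250, 300)]
    + [("inner_race", load) for load in (0, 50, 100, 150, 200, 250, 300)]
)
print(len(mfpt_signals))  # 20

# Distinct load values per fault type (270 lbs adds an 8th outer race load):
outer_loads = {load for cond, load in mfpt_signals if cond == "outer_race"}
inner_loads = {load for cond, load in mfpt_signals if cond == "inner_race"}
print(len(outer_loads), len(inner_loads))  # 8 7
```

The eight distinct outer race loads and seven inner race loads match the load counts discussed in the conclusions.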
4.3 Methodology Used in this Study
Networks | SqueezeNet, GoogLeNet, ResNet-50 | Inception-ResNet-v2 |
---|---|---|
Learning Rate | 1e-4 | 1e-4 |
Mini Batch Size | 64 | 32 |
Optimizer | Adam | Adam |
Validation Frequency | 50 | 50 |
Number of Epochs | 30 | 15 |
Number of Iterations | 1290 | 1305 |
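Assuming the usual convention that iterations per epoch equals floor(number of training images / mini-batch size), the iteration counts in the table are consistent with the 2800 training images reported in Section 6:

```python
# 2,800 training images after the randomized train-validation split
num_train_images = 2800

def total_iterations(batch_size, epochs):
    # Iterations per epoch = full mini-batches per pass over the data
    return (num_train_images // batch_size) * epochs

print(total_iterations(64, 30))  # 1290 (SqueezeNet, GoogLeNet, ResNet-50)
print(total_iterations(32, 15))  # 1305 (Inception-ResNet-v2)
```

Whether the training framework drops the final partial batch is an assumption here, but it is the only convention that reproduces both table values exactly.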
5 Experimental Results
Total accuracy reached on the CWRU dataset:

Method | SqueezeNet | GoogLeNet | ResNet-50 | Inception-ResNet-v2
---|---|---|---|---
Spectrogram (Unthresholded) | 97.67% | 98.27% | 98.92% | 98.59% |
Spectrogram (Thresholded) | 96.00% | 97.25% | 98.00% | 96.13% |
Wavelet-based Synchrosqueezing Transform | 97.04% | 97.63% | 98.07% | 98.68% |
Fourier-based Synchrosqueezing Transform | 99.04% | 99.14% | 99.84% | 99.79% |
Wigner-Ville Distribution | 96.21% | 96.46% | 98.22% | 98.74%
Continuous Wavelet Transform | 98.93% | 99.16% | 99.66% | 99.34% |
Constant-Q Nonstationary Gabor Transform | 98.44% | 98.50% | 99.55% | 99.08% |
Hilbert Huang Transform | 88.91% | 90.91% | 93.41% | 92.16% |
Hilbert Huang Transform (Cropped) | 91.38% | 91.55% | 93.73% | 93.47% |
Scattergram Filter Bank 1 | 98.23% | 98.50% | 99.89% | 99.68% |
Scattergram Filter Bank 2 | 86.08% | 86.84% | 89.04% | 90.37% |
Instantaneous Frequency | 98.82% | 99.25% | 99.83% | 99.57% |
Kurtogram | 85.19% | 87.85% | 90.44% | 90.44% |
Kurtogram (Generalized) | 96.32% | 96.68% | 97.44% | 97.75% |
Power | 97% | 97.15% | 99.25% | 99.34% |
Persistence | 98.62% | 99.18% | 99.71% | 99.70% |
Spectral Kurtosis | 76.40% | 76.66% | 84.39% | 83.5% |
Spurious Free Dynamic Range | 95.53% | 95.52% | 97.15% | 98.37% |
Total accuracy reached on the MFPT dataset:

Method | SqueezeNet | GoogLeNet | ResNet-50 | Inception-ResNet-v2
---|---|---|---|---
Scattergram Filter Bank 1 | 99.44% | 99.13% | 99.92% | 100.00% |
6 Conclusion & Discussion
- In every training phase on the CWRU dataset, 400 images were used for each bearing condition (100 images for each of loads 0, 1, 2, and 3), i.e., a total of 4000 images. After the randomized train–validation split, only 2800 images were used in the training phase, resulting in a shorter training time.
- For the validation phase on the CWRU dataset, 1600 images were used for each bearing condition (400 images for each of loads 0, 1, 2, and 3), summing to a total of ~16,000 images, with ~13,200 unseen images for each method. In conclusion, the train–validation split of the proposed method was 17.5% for training and 82.5% for validation. The best accuracy achieved on this split was 99.89%, further demonstrating the robustness of the method used.
- The best performing time–frequency imaging method among those benchmarked on the CWRU dataset was also tested on the MFPT dataset, where it achieved 100% accuracy with the Inception-ResNet-v2 CNN architecture. This shows that the methodology used in this study is replicable on vibration data sampled at high frequencies.
- Unlike other approaches in the literature, the proposed method was trained on all loads and all fault sizes. The total sample size was 4–8 times greater than in most approaches, making the proposed approach more versatile and less prone to overfitting.
- In a real-world production environment, the load on the motor can change at any time, and holding the load at a set value is difficult. The proposed method achieves high accuracy regardless of the load on the DC motor, making this approach usable in real-world industrial applications.
- The MFPT dataset contained eight different load sizes for the outer race fault conditions and seven for the inner race fault conditions. Although the loads differ greatly in this dataset, 100% accuracy was achieved independently of the load sizes; load-independent bearing fault classification was thus successfully achieved in this study.
- Of the 72 validation accuracies on the CWRU dataset, 10 exceed 99.5%, 9 fall between 99% and 99.5%, and 25 fall between 97% and 99%. This shows that most methods used in this study give highly reliable results, comparable to state-of-the-art studies.
- Since SqueezeNet is a simple, sequential convolutional neural network only 18 layers deep, its performance is lower than that of the other networks. The highest accuracy achieved with SqueezeNet on the CWRU dataset is 99.04%, with Fourier-based Synchrosqueezing Transform images as input.
- GoogLeNet, with more depth (22 layers) and inception modules, generally performs better than SqueezeNet but worse than ResNet-50 and Inception-ResNet-v2; the reason might be information loss between its layers. The best accuracy achieved with this CNN on the CWRU dataset is 99.25%, with Instantaneous Frequency image input.
- Although not the deepest network in this study, ResNet-50 (50 layers) obtains the best accuracy. Of the 18 methods used on the CWRU dataset, ResNet-50 gave the best-performing result in 11. The best accuracy achieved on the CWRU dataset in this study (99.89%, Scattergram Filter Bank 1) used the ResNet-50 architecture.
- Inception-ResNet-v2 is the deepest and most resource-consuming network in this study, but not the best-performing network on the CWRU dataset; it achieves the best result in six of the methods. The best result achieved with this CNN is 99.79%, with the Fourier-based Synchrosqueezing Transform. On the MFPT dataset, Inception-ResNet-v2 proved to be the most successful CNN architecture, with 100% accuracy.
- The average accuracies for each CNN show that the Fourier-based Synchrosqueezing Transform gives the best result overall, with an average accuracy of 99.45%. This suggests that FSST preserves the clearest information of the 18 methods.
- Thresholding the spectrograms at a certain dB value decreased the total accuracy because of the information loss it introduces.
- Cropping the HHT images to include only the meaningful parts of the image increased the accuracy by ~1–3%, further showing the importance of removing redundant areas before CNN training.
- Because of its tree structure, the Kurtogram is more discrete than the other methods in this study, so the generalization process was of utmost importance. Generalizing the Kurtogram resulted in a dramatic improvement of 7–11%, showing the benefit of increasing the number of data points for discrete imaging methods.
- Changing the filter bank of the scattergram changes the accuracy by as much as 9–12%, showing the significance of choosing a filter appropriate to the application.
- No previous study had used the Scattergram with Filter Bank 1 on the CWRU dataset. This method gave the best results of the 18 methods on the CWRU dataset and also achieved excellent results on the MFPT dataset, strongly suggesting that scattergrams should be used more often in future work.
- With 72 diverse training and validation phases, this study is one of the most comprehensive and detailed studies employing the CWRU Bearing Data Center dataset.
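The image counts and split percentages reported above can be verified with a few lines of arithmetic:

```python
# 10 CWRU bearing conditions, 1,600 images each (400 per load x 4 loads)
total_images = 10 * 1600
train_images = 2800                   # images used in the training phase
val_images = total_images - train_images
print(total_images, val_images)       # 16000 13200
print(train_images / total_images)    # 0.175 -> 17.5% training
print(val_images / total_images)      # 0.825 -> 82.5% validation
```

The unusually validation-heavy split (82.5% unseen data) is what makes the reported 99.89% best accuracy a strong robustness result.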