1 Introduction
In most manufacturing processes, roller bearings must be kept in a healthy condition to guarantee stable production. It is therefore essential to monitor the health condition of roller bearings to avoid machine breakdowns. Bearings may be categorised into two key types: (i) plain (sliding) bearings; and (ii) rolling bearings. Of these, rolling bearings are commonly used in most applications of rotating machinery [1]. Vibration-based condition monitoring has been extensively studied and has become a well-accepted method for planned maintenance management, as various typical features can be observed from vibration signals. In general, machine learning classifiers can use these features to identify machine health conditions. However, the extracted features are typically distorted by noise and measurement errors, which makes it challenging in practice to obtain distinguishable data that generalise well. Consequently, a considerable body of literature addresses feature extraction and feature selection from vibration signals for machine fault diagnosis.
It is now well established from a variety of studies that vibration signal analysis falls into three main groups: time domain, frequency domain, and time-frequency domain. Various time-domain techniques are used for vibration signal analysis. For instance, most time-domain techniques extract features from the raw vibration signals for bearing fault diagnosis using statistical functions as well as some other advanced functions [2–11]. A considerable amount of literature has been published on frequency-domain techniques that extract various spectrum features from vibration signals to efficiently represent a bearing’s health condition. These studies showed that frequency-domain analysis can reveal information in vibration signals that is not easily observed in the time domain. For example, Fourier analysis, including the Fourier series, the discrete Fourier transform (DFT), and the fast Fourier transform (FFT), is used to transform time-domain vibration signals to the frequency domain [12–18]. Moreover, various techniques are used to extract different spectrum features to represent a bearing’s health condition. For instance, envelope analysis, also called the high-frequency resonance technique, has been evaluated for detecting incipient bearing faults [19]. Furthermore, various frequency-domain features based on higher-order spectra techniques are utilised to represent the bearing’s health condition [20, 21].
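As a rough illustration of the time-domain and frequency-domain features mentioned above, the following minimal sketch computes a few common statistical features and a naive DFT magnitude spectrum. The function names, the choice of features, and the synthetic sine signal are assumptions made for illustration; they are not taken from the cited studies, and a practical implementation would use an FFT routine rather than this O(n²) DFT.

```python
import cmath
import math


def time_domain_features(signal):
    """Return a few common statistical features of a 1-D signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    variance = sum((x - mean) ** 2 for x in signal) / n
    peak = max(abs(x) for x in signal)
    # Kurtosis: fourth central moment over squared variance (population form).
    kurtosis = sum((x - mean) ** 4 for x in signal) / n / variance ** 2
    return {"mean": mean, "rms": rms, "peak": peak,
            "crest_factor": peak / rms, "kurtosis": kurtosis}


def dft_magnitudes(signal):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n // 2."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal)))
            for k in range(n // 2 + 1)]


# Synthetic test signal (an assumption): 4 cycles of a unit sine in 64 samples.
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
features = time_domain_features(signal)
spectrum = dft_magnitudes(signal)
dominant_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

For this synthetic sine, the spectrum peaks at the bin matching the number of cycles, which is the kind of information that is harder to read directly from the time-domain samples.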
Time-frequency domain methods such as the short-time Fourier transform (STFT), wavelet transform (WT), Hilbert-Huang transform (HHT), local mean decomposition (LMD), and empirical mode decomposition (EMD), which were introduced for nonstationary waveform signals, are used to extract features from vibration signals for bearing fault diagnosis [22–32]. Several classification methods, such as logistic regression (LR), artificial neural networks (ANNs), and support vector machines (SVMs), can be utilised to classify different vibration signals based on the extracted features [1]. If the features are sensibly formulated and the parameters of the classification methods are carefully tuned, it is possible to achieve high classification accuracy. Nevertheless, extracting useful features from such a large and noisy vibration dataset, which may also contain measurement errors, is usually a challenging task. Recently, several lines of evidence have suggested that feature-learning methods, which automatically learn representations of the vibration dataset, can address this challenge. Deep learning (DL), which usually learns representations of the data using a hierarchical multi-layer data processing architecture, has been attracting a lot of interest. For example, autoencoder-based deep neural network (DNN) methods are used for bearing fault diagnosis in several studies [33–39].
Moreover, the literature on DL-based techniques for machine fault diagnosis includes several studies describing the use of deep belief networks (DBNs) for bearing fault diagnosis [40–44]. Furthermore, the application of recurrent neural network (RNN)-based techniques to bearing fault diagnosis has been investigated by several researchers [45–48]. In the same vein, several studies used convolutional neural network (CNN)-based algorithms to process vibration signals for bearing fault diagnosis [49–54]. Most of these studies applied pre-processing techniques such as the FFT, WT, time-domain statistical functions, and spectral kurtosis to extract features from the raw vibration signals and used them as the input to the targeted DL technique, while others used the raw vibration signals directly as the input. However, all the previously mentioned methods suffer from some serious limitations. For example: (1) feature extraction from fault signals requires expert prior knowledge and human labour; (2) it is sometimes hard to recognize fault features using only time-domain features, only frequency-domain features, or only time-frequency domain features; and (3) the deep CNN architecture was originally designed for 2-D signals such as images, and its application to 1-D signals such as vibrations is not straightforward.
Lately, researchers have shown increased interest in transforming the 1-D vibration signal into a 2-D image, which can often offer more discriminative descriptions of the vibration signals and allows direct usage of the CNN for fault diagnosis. For instance, Chong proposed a fault diagnosis method for induction motors utilizing features of vibration signals in the two-dimensional domain. In this method, the 2-D features of the vibration signal are obtained using the scale-invariant feature transform (SIFT) [55]. In Ref. [56], an ANN classifier with vibration spectrum imaging (VSI) is used for bearing fault classification, where the vibration signal is first divided into time segments and converted into an image. Then, the spectral contents of each image are computed using the FFT and normalized to form a spectral image. Afterwards, to enhance the features of the obtained spectral images, an average filter and a binary threshold are applied to retain featured patterns and remove noise patterns. Finally, an ANN is used as a fault classifier with these enhanced features.
Moreover, Kang and Kim presented a method for diagnosing multiple induction motor faults using a 2-D representation of Shannon wavelets. In this method, wavelet coefficients derived from the Shannon wavelet function with dilation and translation parameters are first used to create 2-D gray-level images. Then, the texture features of the created images are utilised as inputs to a multi-class support vector machine (SVM) classifier to identify faults in the induction machine [57]. Li et al. [58] presented a method for bearing fault diagnosis using spectrum images of vibration signals. In this method, the FFT is first used to obtain the spectrum images, and each image is then processed with 2-D principal component analysis (2DPCA) to reduce its dimensions. Finally, a minimum distance method is employed to classify bearing faults. Lu et al. [59] proposed a fault diagnosis method for rotating machinery using image processing. In this method, the bi-spectrum technique is first used to transform the vibration signal into a bi-spectrum contour map. Then, the speeded-up robust features (SURF) detector and descriptor is employed to automatically extract features from the bi-spectrum contour map. Next, the t-distributed stochastic neighbor embedding (t-SNE) technique is used to reduce the dimensionality of the generated feature vectors. Finally, with these reduced features, a probabilistic neural network is used for fault identification. Verstraete et al. [60] presented a method for rolling element bearing fault diagnosis using time-frequency representations and a CNN. In this method, to validate the ability of the proposed CNN model to accurately diagnose bearing faults, three time-frequency techniques, i.e., STFT, WT, and HHT, are used to generate different representations of the raw signal. Then, these representations are separately fed into a CNN architecture for fault classification. The classification accuracies of the three representations are compared to study their effectiveness.
Additionally, a vibration imaging and deep learning-based feature engineering technique has been proposed for rotor system fault diagnosis. In this technique, vibration signals are first collected from sensors in the rotor system, and vibration images are then prepared as input to a deep learning architecture. The vibration images are generated by first producing signals from virtual vibration sensors and then stacking the individual vibration signals according to a phase synchronization rule. Next, the vibration images are enhanced using the histogram of oriented gradients (HOG) descriptor. Then, DBN pretraining is used to extract high-level features from the generated vibration images. Finally, a fault classifier obtained by fine-tuning the pre-trained DBN in combination with a multilayer perceptron (MLP) is used for fault diagnosis [61]. Zhang et al. [62] presented a technique for bearing fault diagnosis using a CNN with a 2-D representation of vibration signals. In this technique, the raw vibration signal is first divided into n equal parts, and each part is placed in sequence as a row of the 2-D image representation. Then, the obtained 2-D representations of the vibration signals are used as input to a CNN architecture for fault classification. Hasan and Kim [63] proposed a method for bearing fault diagnosis under variable rotational speeds using Stockwell transform-based imaging and transfer learning. In this method, discrete orthonormal Stockwell transform (DOST)-based vibration imaging is used as a preprocessing step to generate health patterns. Then, a CNN-based transfer learning approach is used for fault diagnosis.
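The row-stacking representation attributed to Zhang et al. [62] above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `signal_to_image`, the handling of leftover samples, and the toy signal are assumptions.

```python
def signal_to_image(signal, n_rows):
    """Split a 1-D signal into n_rows equal parts and stack them as rows."""
    row_len = len(signal) // n_rows
    # Assumption: trailing samples that do not fill a complete row are dropped.
    return [list(signal[r * row_len:(r + 1) * row_len]) for r in range(n_rows)]


# Toy signal of 12 samples arranged into 3 rows of 4 samples each.
signal = list(range(12))
image = signal_to_image(signal, n_rows=3)
```

Each row holds consecutive samples, so vertical neighbours in the image are samples that lie one row length apart in time.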
In 2019, Wang et al. [64] proposed a fault recognition technique based on multi-sensor data fusion and a bottleneck layer optimized CNN (MB-CNN). In this technique, the vibration signals of several sensors are fused in the feature maps. Then, the MB-CNN is used to extract features and perform fault recognition for rotating machinery. Yan et al. [65] proposed a fault diagnosis method for an active magnetic bearing-rotor system using vibration images. In this method, three features, the histogram of vibration image (HVI), the histogram of oriented vibration image (HOVI), and the 2-D FFT of vibration image (2DFFT), are designed based on signal amplitude, signal phase, and the frequency domain, respectively. Then, a feature fusion technique called two-layer AdaBoost is introduced to train the fault recognition model. Moreover, Hoang and Kang presented a method for rolling element bearing fault diagnosis using a CNN and vibration images. In this method, the amplitude of each sample in the vibration signal is normalized into the range [−1, 1], and each normalized amplitude then becomes the intensity of the corresponding pixel in the vibration image. Next, a CNN architecture is applied to these vibration images for fault diagnosis [66].
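The amplitude-to-intensity mapping described for Hoang and Kang's method [66] might look roughly like the sketch below. The use of min-max scaling, the square image shape, and both function names are assumptions made for illustration; the text above only states that amplitudes are normalized into [−1, 1] and used as pixel intensities.

```python
import math


def normalize_to_unit_range(signal):
    """Min-max scale amplitudes linearly into [-1, 1] (assumed scaling rule)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [0.0 for _ in signal]  # constant signal: no contrast
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]


def to_square_image(values):
    """Fill a size x size image row by row; surplus samples are dropped."""
    size = math.isqrt(len(values))
    return [values[r * size:(r + 1) * size] for r in range(size)]


# Toy signal of 9 samples, giving a 3 x 3 vibration image.
signal = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
image = to_square_image(normalize_to_unit_range(signal))
```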
Furthermore, Zhu et al. [67] proposed a method for bearing fault diagnosis using a CNN based on a capsule network with an inception block (ICN). In this method, the raw data are first converted from a one-dimensional signal to a two-dimensional graph using the STFT. Then, the ICN, which is proposed to address the poor generalization of CNN models when diagnosing bearing faults under different loads, is used for fault diagnosis. Ma et al. [68] presented a bearing fault diagnosis method using a 2-D image representation and a transfer learning-based CNN (TLCNN). In this method, the time-domain raw signals are first reconstructed into a fault signal component using the frequency slice wavelet transform (FSWT), and the reconstructed signals are then converted into 2-D time-frequency images. With these time-frequency images, the TLCNN model is used to extract features and classify the bearing health conditions. Zhu et al. [69] proposed a method for rotor fault diagnosis using a CNN with symmetrized dot pattern (SDP) images. In this method, vibration signals are transformed into SDP images, and the graphical features of the SDP images of different vibration states are then learned and identified using a CNN architecture.
Recently, Zhang et al. [70] presented an enhanced CNN method for bearing fault diagnosis based on time-frequency images. In this method, the STFT is used to obtain the input images. Then, the obtained images are used as input to a CNN architecture with the scaled exponential linear unit (SELU) activation function. Kaplan et al. [71] presented a feature extraction method for bearing fault diagnosis using texture analysis with local binary patterns (LBP). In this method, bearing vibration signals are first converted to grayscale images. Then, the LBP technique is employed to obtain texture features. Finally, the obtained features are used as input to different classifiers, such as K-nearest neighbor (K-NN), random forest (RF), Naive Bayes, Bayes Net, and ANN, for classification.
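To make the LBP texture features mentioned above more concrete, here is a minimal sketch of the basic 3 × 3 LBP operator followed by a 256-bin histogram. The neighbour ordering, the `>=` comparison, and the function names are common conventions assumed here; the variant used by Kaplan et al. [71] may differ.

```python
# Neighbour offsets (row, column), clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image):
    """Return the 8-bit LBP code of every interior pixel of a grayscale image."""
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the centre.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(codes):
    """Count how often each of the 256 possible LBP codes occurs."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist


# Toy 3 x 3 image with a single interior pixel.
image = [[1, 2, 3],
         [8, 5, 4],
         [7, 6, 5]]
codes = lbp_codes(image)
hist = lbp_histogram(codes)
```

The histogram, rather than the raw codes, is what would typically be passed to a classifier as the texture feature vector.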
Two important themes emerge from the studies discussed above: (1) given its robust performance in image recognition, the CNN deep learning architecture has been used in most of these studies; and (2) taken together, the fault diagnosis results from these studies indicate that they may be further enhanced by considering two main factors: (a) how well the vibration images are generated from the vibration signals; and (b) how efficiently the generated vibration images reveal the distinct patterns of each vibration signal status. As for image feature extraction, several types of features can be extracted from the generated vibration images, such as texture, colour, shape, and pixel intensity. Hence, the more characteristics the generated vibration images contain, the more robust the extracted features can be and, consequently, the more accurate the learned classification model.
This paper proposes a two-stage method for bearing fault diagnosis based on RGB vibration image representation and CNN (RGBVI-CNN). In the first stage, a technique is proposed for generating image representations from vibration signals that can successfully represent the bearing health condition. This technique uses image analytics to generate vibration images with rich characteristics in three main steps: (1) convert the 1-D vibration signals to 2-D gray-level vibration images; (2) find the region of interest (ROI) in the binary images of the converted 2-D gray-level vibration images; and (3) generate RGB vibration images (RGBVIs) based on the connected components of the ROI of each vibration image, which demonstrate useful characteristics of the targeted vibration signals. In the second stage, the RGBVIs, with their texture and colour features, are used as inputs to a CNN architecture to further learn useful features, which are used to obtain an accurate classification model for bearing faults.
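The three first-stage steps listed above can be illustrated with the loose sketch below. It is not the method of this paper, whose details are given in Section 2: the mean-level threshold used to form the binary image, the 4-connectivity, the colour palette, and all function names are hypothetical choices made only to show the general flow from a 1-D signal to a colour-coded image.

```python
from collections import deque

# Hypothetical colour palette; the actual colouring rule is not specified here.
PALETTE = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]


def to_gray_image(signal, width):
    """Step 1: scale samples to 0..255 and stack them into rows."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0
    pixels = [round(255 * (x - lo) / span) for x in signal]
    rows = len(pixels) // width
    return [pixels[r * width:(r + 1) * width] for r in range(rows)]


def to_binary(gray):
    """Step 2 (assumed rule): foreground = pixels above the mean gray level."""
    flat = [p for row in gray for p in row]
    threshold = sum(flat) / len(flat)
    return [[1 if p > threshold else 0 for p in row] for row in gray]


def label_components(binary):
    """Label 4-connected foreground regions 1, 2, ...; background stays 0."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one component
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def to_rgb(labels):
    """Step 3 (assumed rule): one palette colour per component, black elsewhere."""
    return [[PALETTE[(lab - 1) % len(PALETTE)] if lab else (0, 0, 0)
             for lab in row] for row in labels]


# Toy signal of 12 samples arranged as a 3 x 4 image with two bright regions.
signal = [0, 9, 0, 0,
          0, 9, 0, 8,
          0, 0, 0, 8]
gray = to_gray_image(signal, width=4)
labels, count = label_components(to_binary(gray))
rgb = to_rgb(labels)
```

In this toy example the two bright regions become two separately coloured components, which hints at how connected components can add colour and texture cues to a gray-level vibration image.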
This study aims to contribute to this growing area of research by exploring the efficacy of the two-stage RGBVI-CNN method in generating vibration images with useful characteristics that can distinguish different bearing health conditions using vibration signals. The contributions of this paper are summarized as follows:
1) A new three-step approach of image analytics techniques is proposed in the first stage of the RGBVI-CNN, which produces 2-D RGBVIs with advantageous texture and colour features of bearing health conditions from the 1-D time-series vibration signals. The approach does not require any prior knowledge or any programmed parameters.
2) The visualized texture and colour features of the RGBVIs that are generated in the first stage of the RGBVI-CNN method can visually offer discriminative patterns of bearing health conditions.
3) A CNN-based deep learning architecture with three feature learning blocks is proposed to automatically learn features from the RGBVIs and to achieve improved classification accuracy for bearing health conditions.
The rest of this paper is structured as follows. Section 2 is dedicated to a description of the proposed method. Section 3 is devoted to a description of the performed experiments and datasets of the two case studies of bearing fault classification and the corresponding experimental results. Finally, Section 4 draws some conclusions from this study.