Published in: Neural Processing Letters 2/2024

Open Access 01.04.2024

Non-Uniformly Weighted Multisource Domain Adaptation Network For Fault Diagnosis Under Varying Working Conditions

Written by: Hongliang Zhang, Yuteng Zhang, Rui Wang, Haiyang Pan, Bin Chen


Abstract

Most transfer learning-based fault diagnosis methods learn diagnostic information from the source domain to enhance performance in the target domain. However, in practical applications, usually there are multiple available source domains, and relying on diagnostic information from only a single source domain limits the transfer performance. To this end, a non-uniformly weighted multisource domain adaptation network is proposed to address the above challenge. In the proposed method, an intra-domain distribution alignment strategy is designed to eliminate multi-domain shifts and align each pair of source and target domains. Furthermore, a non-uniform weighting scheme is proposed for measuring the importance of different sources based on the similarity between the source and target domains. On this basis, a weighted multisource domain adversarial framework is designed to enhance multisource domain adaptation performance. Numerous experimental results on three datasets validate the effectiveness and superiority of the proposed method.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

The rapid development of the manufacturing industry is closely tied to machinery of many kinds. Modern mechanical equipment is increasingly automated and intelligent, which places high demands on the reliability and stability of its components [1]. Bearings support rotating mechanical parts and play a crucial role in the operation of the equipment. Therefore, accurately monitoring the condition of bearings helps ensure normal operation and avoid the economic losses and safety risks caused by equipment damage [2–4].
In recent years, with the rise of artificial intelligence techniques such as machine learning, data-driven intelligent fault diagnosis (IFD) methods have attracted considerable attention [5]. Among them, IFD methods based on deep learning (DL) are widely used because of their adaptability and learning ability [6]. DL-based IFD methods usually assume that a large amount of labeled, identically distributed data is available [7, 8]. However, this assumption does not always hold in complex practical scenarios, which leads to serious degradation of diagnostic performance [9, 10].
To address the above challenges, an important tool of the transfer learning technique, domain adaptation (DA), is introduced to IFD [11, 12]. DA facilitates the utilization of transferable diagnostic information learned from the source domain in the target domain through domain confusion, thereby effectively enhancing diagnostic accuracy in the target domain [13]. A common DA approach is realized by statistical matching [14, 15]. For instance, Schwendemann et al. [16] introduced a layered maximum mean discrepancy and integrated it as a loss function within the deep neural network, effectively mitigating the distribution discrepancy between the source and target domains. Similarly, Xiong et al. [17] designed a central moment discrepancy to minimize the distribution discrepancy between domains for cross-domain fault diagnosis. Another common DA approach is domain adversarial training [18, 19]. Ganin et al. [20] introduced a domain adversarial neural network (DANN) to extract domain invariant features in the target and source domains. Wang et al. [21] designed a deep adversarial DA model to effectively mitigate the distribution discrepancy between the source and target domains. Although the aforementioned methods effectively mitigate domain shift, they are limited to learning transferable information solely from a single source domain. A single source of diagnostic information may ultimately limit the cross-domain diagnostic performance of the model.
Generally, multiple source domains exist because equipment operates in complex and variable environments. A multisource dataset offers more abundant diagnostic information, which can compensate for the lack of diagnostic knowledge in a single source domain [22]. Traditional DA methods that learn only single-source information cannot fully utilize the information shared among multiple source domains [23]. Fortunately, multisource DA (MDA) methods have been successfully explored to solve this problem. For example, a novel adversarial DA network with classifier alignment was designed by Zhang et al. [24] for multisource domain problems that may arise in real industrial scenarios. Shi et al. [25] achieved knowledge transfer from multiple sources based on an MDA network with an entropy penalty strategy. Multi-adversarial learning was employed by Zhu et al. [26] to extract high-dimensional domain-invariant features. Wang et al. [27] constructed a sub-domain adaptation network that enables multisource information transfer by integrating source-private feature extractors and classifiers. Shi et al. [28] proposed an instance-adaptive multisource transfer method that mitigates multisource domain shifts and improves the ability of the model to learn multisource diagnostic information.
Although the above methods achieve effective MDA through multi-domain matching, they ignore the distribution discrepancy between each pair of source and target domains, which restricts the accurate matching of source and target domains. Moreover, since the source domain data are collected under different working conditions, their respective contributions to the target domain also differ. A source domain close to the target working condition should be given a larger weight, because it contains more diagnostic information in common with the target domain. The studies in [29, 30] demonstrate that more competitive classification performance can be achieved by weighting multiple source domains. An improved LMMD was proposed by Tian et al. [31] for computing source-specific weight scores and combining the weights with a multi-classifier to account for the different contributions of the sources. According to the similarity of different working conditions, Wei et al. [32] constructed a specific discriminator for each source in order to implement non-uniform weighting. The success of these methods shows that source weights need to be considered. However, their weighting schemes rely on constructing additional network models and parameters, which increases the computational cost. Therefore, a more efficient weighted MDA approach needs to be developed for multisource knowledge transfer.
To this end, a non-uniformly weighted MDA network (NWMDAN) is developed in this study. The proposed method eliminates multi-domain shifts from both the inter-domain perspective between source and target domains and the intra-domain perspective for each pair of source and target domains. To measure the contribution of multiple source domains effectively, a non-uniform weighting scheme is designed based on the similarity between each source domain and the target domain. Furthermore, a non-uniformly weighted adversarial training framework is proposed to enhance MDA performance. The main contributions of this paper are as follows.
(1) A more realistic fault diagnosis scenario, the multisource cross-domain fault diagnosis problem, is explored in this paper. To this end, an NWMDAN is proposed to enhance the MDA performance.
(2) To effectively eliminate multi-domain shifts, an intra-domain distribution alignment strategy is designed to eliminate the intra-domain distribution discrepancy for each pair of source and target domains.
(3) From the perspective of similarity between multiple source and target domains, a non-uniform weighting scheme is proposed for quantifying the contributions of different source domains.
(4) A non-uniformly weighted adversarial training framework is proposed to learn and combine the multisource information better.
The remainder of this paper is structured as follows. The problem definition and relevant theoretical background are provided in Sect. 2. Section 3 provides the framework and diagnostic flow of the proposed NWMDAN method. The experimental validation and performance analysis are carried out in Sect. 4. Finally, the conclusions of this study are given in Sect. 5.

2 Preliminaries

2.1 Problem Definition

In this paper, the problem of multisource cross-domain fault diagnosis is investigated. It conforms to the following basic assumptions: (a) source domain data are collected from multiple working conditions; (b) target domain data are unlabeled; (c) the label space is shared by both the source and target domains.
Based on the above assumptions, the symbols used in the problem are defined as follows. Assume that K source domains are available, and let \(D_{k}^{s}=\{(x_{i}^{s,k},y_{i}^{s,k})\}_{i=1}^{{{n}_{s,k}}}(k=1,2,...,K)\) represent the k-th source domain dataset with \({{n}_{s,k}}\) labeled samples, where \(x_{i}^{s,k}\) and \(y_{i}^{s,k}\) denote the i-th sample and the corresponding condition label from the k-th source domain, respectively. Let \({{D}^{t}}=\{x_{j}^{t}\}_{j=1}^{{{n}_{t}}}\) be the target domain dataset with \({{n}_{t}}\) unlabeled samples, where \(x_{j}^{t}\) denotes the j-th sample from the target domain. The purpose of this study is to build a reliable diagnosis model for multisource cross-domain tasks.

2.2 Maximum Mean Discrepancy

Maximum mean discrepancy (MMD) [32] is often used by statistics matching-based DA methods to measure the distance between the distributions of different domains [33, 34]. Given \(X=\{{{x}_{i}}\mid {{x}_{i}}\sim p,i=1,2, \ldots , n\}\) and \(Y=\{{{y}_{j}}\mid {{y}_{j}}\sim q,j=1,2, \ldots , m\}\), the MMD between X and Y is defined as follows:
$$\begin{aligned} D_{\mathcal {H}}(X,Y)\triangleq \left\| E_{x\sim p}[\Phi (x)]-E_{y\sim q}[\Phi (y)]\right\| _{\mathcal {H}} \end{aligned}$$
(1)
where \(\mathcal {H}\) represents the reproducing kernel Hilbert space (RKHS), and \(\Phi (\cdot )\) denotes the feature mapping function. By introducing a characteristic kernel, the squared value of MMD can be expressed as follows:
$$\begin{aligned} MMD_{(X,Y)}^{2}=\frac{1}{{{n}^{2}}}\sum \limits _{i,j=1}^{n}{\mathcal {K}({{x}_{i}},{{x}_{j}})} -\frac{2}{mn}\sum \limits _{i=1}^{n}{\sum \limits _{j=1}^{m}{\mathcal {K}({{x}_{i}},{{y}_{j}})}} +\frac{1}{{{m}^{2}}}\sum \limits _{i,j=1}^{m}{\mathcal {K}({{y}_{i}},{{y}_{j}})} \end{aligned}$$
(2)
where \(\mathcal {K}(\cdot ,\cdot )\) represents the kernel function. If multiple kernels are available, the multi-kernel MMD (MK-MMD) can be used as a more effective metric to evaluate the differences in the different domains.
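To make Eq. 2 concrete, the following is a minimal sketch in plain Python of the biased empirical estimate of the squared MMD. The Gaussian kernel, its bandwidth `sigma`, and the function names are illustrative assumptions; the text does not specify the kernel at this point.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_squared(X, Y, kernel=gaussian_kernel):
    """Biased empirical squared MMD, written term by term as in Eq. 2."""
    n, m = len(X), len(Y)
    k_xx = sum(kernel(a, b) for a in X for b in X) / (n * n)
    k_xy = sum(kernel(a, b) for a in X for b in Y) / (n * m)
    k_yy = sum(kernel(a, b) for a in Y for b in Y) / (m * m)
    return k_xx - 2.0 * k_xy + k_yy

# Hypothetical toy samples: Y is X shifted away from the origin.
X = [[0.0, 0.0], [1.0, 0.0]]
Y = [[3.0, 3.0], [4.0, 3.0]]
same = mmd_squared(X, X)      # identical sets give a value of (numerically) zero
shifted = mmd_squared(X, Y)   # shifted sets give a strictly positive value
```

A multi-kernel variant would replace `gaussian_kernel` with a combination of several kernels of different bandwidths.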

2.3 Domain Adversarial Neural Network

As an adversarial training-based DA method, the domain adversarial neural network (DANN) [35] has attracted considerable attention in the field of IFD [36, 37]. DANN is composed of a classifier C, a feature extractor F and a domain discriminator D. The key to domain adversarial training is the game between F and D: D is trained to distinguish accurately whether a sample comes from the source or the target domain, whereas the goal of F is to confuse D as much as possible. The optimization objective of DANN is defined as follows:
$$\begin{aligned} {{L}_{DANN}}=\frac{1}{{{n}_{s}}}\sum \limits _{{{x}_{i}}\in {{D}_{s}}}{{{L}_{c}}(C}(F({{x}_{i}})),{{y}_{i}})-\frac{\lambda }{{{n}_{s}}+{{n}_{t}}}\sum \limits _{{{x}_{i}}\in ({{D}_{s}}\cup {{D}_{t}})}{{{L}_{d}}}(D(F({{x}_{i}})),{{d}_{i}}) \end{aligned}$$
(3)
where \({{L}_{c}}\) and \({{L}_{d}}\) represent the cross-entropy loss, \({{n}_{s}}\) and \({{n}_{t}}\) denote the number of samples in \({{D}_{s}}\) and \({{D}_{t}}\), \({{y}_{i}}\) and \({{d}_{i}}\) denote the true condition label and the domain label of the i-th sample, respectively, and \(\lambda \) represents the trade-off parameter. To avoid a phased training process, a gradient reversal layer (GRL) is inserted into DANN. The GRL adds no parameters and reverses the sign of the gradient during backpropagation.
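The effect of the GRL can be written compactly. The notation \(R_{\lambda }\) below follows common usage in the DANN literature and is not defined in this section, so it should be read as an illustrative sketch: the layer acts as the identity in the forward pass and multiplies the gradient by \(-\lambda \) in the backward pass,
$$\begin{aligned} R_{\lambda }(f)=f,\qquad \frac{\partial R_{\lambda }}{\partial f}=-\lambda I. \end{aligned}$$
Under this reading, feeding \(D(R_{\lambda }(F(x)))\) to the domain loss lets a single backward pass update D to reduce \({{L}_{d}}\) while F receives the sign-reversed gradient, which is consistent with the minus sign in front of the domain term in Eq. 3.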

3 Proposed method

The structure of the proposed NWMDAN is shown in Fig. 1. It consists of a feature extractor, an intra-domain distribution alignment strategy, a non-uniform weighting scheme, and a non-uniformly weighted adversarial training framework. To distinguish the data related to different sources, a separate color is used for each source in Fig. 1. For example, red indicates data from the first source domain. The structure and parameters of the proposed method are shown in Fig. 2. The Conv, BN, and FC blocks shown in Fig. 2 denote the 1D convolutional layers, the batch normalization layers, and the fully connected layers. The three parameters of a Conv layer, for example, Conv (16,15,1), denote the number of input channels, the convolution kernel size and the stride, respectively. The parameter that follows an FC layer represents its output size. The detailed training procedure is given in Algorithm 1.

3.1 Feature Extractor

In the field of IFD, convolutional neural networks (CNN) have been popularly adopted as feature extractors due to their simplicity of training and superior performance [38–40]. Based on this, a CNN with four convolutional layers and a fully connected layer is used as the feature extractor for the proposed method. In particular, batch normalization and dropout are used in the feature extractor to speed up model training and prevent overfitting.

3.2 Intra-Domain Distribution Alignment Strategy

The domain adversarial process described in Sect. 2.3 maps the source and target domain data to a common feature space and reduces the distribution differences between them. However, in multisource problems, it is challenging to eliminate multi-domain shifts directly by globally matching the multisource data with the target data, because distribution discrepancies exist not only between the source and target domains but also among the different source domains. Therefore, an intra-domain distribution alignment strategy is designed to align each source domain with the target domain. Specifically, in each training iteration, the samples in the mini-batch are grouped by source domain, and the distribution discrepancy between each source-domain group and the target-domain samples in the same mini-batch is calculated. The strategy measures this discrepancy using MK-MMD with five Gaussian kernels. The squared MK-MMD between the k-th source domain and the target domain is calculated as follows:
$$\begin{aligned} MMD_{({f^{s,k}},{f^t})}^{2} ={}&\frac{1}{n_{s,k}^{2}}\sum \limits _{i = 1}^{n_{s,k}}\sum \limits _{j = 1}^{n_{s,k}} \mathcal {K}(f_i^{s,k},f_j^{s,k}) \\ &-\frac{2}{n_{s,k}{n_t}}\sum \limits _{i = 1}^{n_{s,k}}\sum \limits _{j = 1}^{n_t} \mathcal {K}(f_i^{s,k},f_j^t) \\ &+\frac{1}{n_t^{2}}\sum \limits _{i = 1}^{n_t}\sum \limits _{j = 1}^{n_t} \mathcal {K}(f_i^t,f_j^t) \end{aligned}$$
(4)
where \(f_{i}^{s,k}\) represents the feature of the i-th sample from \(D_{k}^{s}\), and \(f_{j}^{t}\) represents the feature of the j-th sample from \({{D}^{t}}\). Once the MK-MMD between each of the K source domains and the target domain has been calculated, the intra-domain distribution alignment loss is obtained as follows:
$$\begin{aligned} {L_{\mathrm{{intra}}}} = \frac{1}{K}\sum \limits _{k = 1}^K {MMD_{({f^{s,k}},{f^t})}^2}. \end{aligned}$$
(5)
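The computation in Eqs. 4 and 5 can be sketched in plain Python as follows. The five bandwidths in `SIGMAS`, the equal-weight average of the kernels, and all function names are assumptions made for illustration; the text states only that five Gaussian kernels are used.

```python
import math

# Assumed bandwidths; only the number of kernels (five) comes from the text.
SIGMAS = (0.25, 0.5, 1.0, 2.0, 4.0)

def multi_kernel(x, y, sigmas=SIGMAS):
    """Equal-weight average of Gaussian kernels (assumed combination rule)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return sum(math.exp(-sq_dist / (2.0 * s ** 2)) for s in sigmas) / len(sigmas)

def mk_mmd_squared(fs, ft):
    """Squared MK-MMD between one source batch and the target batch (Eq. 4)."""
    ns, nt = len(fs), len(ft)
    k_ss = sum(multi_kernel(a, b) for a in fs for b in fs) / (ns * ns)
    k_st = sum(multi_kernel(a, b) for a in fs for b in ft) / (ns * nt)
    k_tt = sum(multi_kernel(a, b) for a in ft for b in ft) / (nt * nt)
    return k_ss - 2.0 * k_st + k_tt

def intra_alignment_loss(source_feats, target_feats):
    """Mean of the per-source squared MK-MMD values (Eq. 5)."""
    per_source = [mk_mmd_squared(fs, target_feats) for fs in source_feats]
    return sum(per_source) / len(per_source), per_source

# Hypothetical 2-D features: one source near the target, one far from it.
target = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]]
near = [[0.1, 0.0], [0.3, 0.2], [0.0, 0.2]]
far = [[2.0, 2.0], [2.2, 1.9], [1.8, 2.1]]
loss, per_source = intra_alignment_loss([near, far], target)
```

In this toy setting the source labeled `near` yields a smaller squared MK-MMD than the one labeled `far`, which is the quantity the weighting scheme in the next subsection builds on.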

3.3 Non-Uniform Weighting Scheme

To measure the contribution of different source domains, a non-uniform weighting scheme is designed based on the statistical distance between each source domain and the target domain. Specifically, the MK-MMD distance is used in each epoch to calculate source-specific weights. Domain similarity computed from statistical distances has been shown to be a reliable measure of the contribution of different source domains [29]. Consequently, a source domain whose distribution is closer to that of the target domain should be assigned a larger weight. Using the MMD values of the intra-domain distribution alignment loss accumulated over a whole epoch, the weight of the k-th source domain for the next epoch is formulated as follows:
$$\begin{aligned} {{\omega }_{s,k}}=\frac{\exp (-\eta MMD_{({{f}^{s,k}},{{f}^{t}})}^{2})}{\sum \nolimits _{k'=1}^{K}{\exp (-\eta MMD_{({{f}^{s,k'}},{{f}^{t}})}^{2})}} \end{aligned}$$
(6)
where \(\eta \) is the hardness coefficient, which sharpens the differences among the contributions of the sources, and \(f^{s,k}\) denotes the sample features of \(D_{k}^{s}\).
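Eq. 6 is a softmax over the negated, scaled squared MMD values. The sketch below shows this in plain Python; the MMD values are hypothetical, the function name is illustrative, and subtracting the largest exponent is a standard numerical-stability step that leaves the result unchanged.

```python
import math

def source_weights(mmd_sq, eta=20.0):
    """Source-specific weights from per-source squared MMD values (Eq. 6)."""
    logits = [-eta * d for d in mmd_sq]
    top = max(logits)                      # stability shift; cancels in the ratio
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical accumulated squared MMD values for two source domains.
weights = source_weights([0.02, 0.10], eta=20.0)   # roughly [0.83, 0.17]
uniform = source_weights([0.02, 0.10], eta=0.0)    # eta = 0 gives equal weights
```

The closer source receives the larger weight, and increasing `eta` widens the gap between the weights.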

3.4 Non-Uniformly Weighted Adversarial Training Framework

3.4.1 Non-Uniformly Weighted Discriminator

The domain discriminator is the key player in domain adversarial training. The discriminator of traditional DA methods receives the feature representations of the source and target domains and outputs a probability vector representing the domain labels of the samples. Besides the feature representations of the samples, in NWMDAN, the source-specific weights are also used as input elements to the domain discriminator, which makes the proposed method more effective in extracting diagnostic information. Therefore, the weighted domain discrimination loss \({{L}_{d}}\) of the weighted discriminator is defined as follows:
$$\begin{aligned} {{L}_{d}}=\frac{1}{K}\sum \limits _{k=1}^{K}{\frac{{{\omega }_{s,k}}}{{{n}_{s,k}}}}\sum \limits _{(x_{i}^{s,k},y_{i}^{s,k})\in D_{k}^{s}}{{{L}_{d}}(D(F(x_{i}^{s,k})),d_{i}^{s,k})}+\frac{1}{{{n}_{t}}}\sum \limits _{x_{i}^{t}\in {{D}^{t}}}{{{L}_{d}}(D(F(x_{i}^{t})),d_{i}^{t})} \end{aligned}$$
(7)
where \(d_{i}^{s,k}\) and \(d_{i}^{t}\) are the domain label of \(x_{i}^{s,k}\) and \(x_{i}^{t}\), respectively.
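A minimal sketch of Eq. 7 in plain Python is given below. It assumes that the factor 1/K applies to the source term only, that the discriminator output is already a probability vector over domain labels, and that domain labels are integer indices; the data layout and function names are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Cross-entropy of one prediction given the index of the true label."""
    return -math.log(probs[label])

def weighted_domain_loss(source_preds, weights, target_preds):
    """Weighted domain discrimination loss (Eq. 7).

    source_preds[k] is a list of (domain_probs, domain_label) pairs for the
    k-th source domain; target_preds has the same layout for the target.
    """
    num_sources = len(source_preds)
    source_term = 0.0
    for weight, preds in zip(weights, source_preds):
        mean_ce = sum(cross_entropy(p, d) for p, d in preds) / len(preds)
        source_term += weight * mean_ce
    target_term = sum(cross_entropy(p, d) for p, d in target_preds) / len(target_preds)
    return source_term / num_sources + target_term

# Hypothetical discriminator outputs with two domain labels (0 = source, 1 = target).
src = [[([0.5, 0.5], 0)], [([0.5, 0.5], 0)]]
tgt = [([0.5, 0.5], 1)]
loss_d = weighted_domain_loss(src, [0.7, 0.3], tgt)
```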

3.4.2 Non-Uniformly Weighted Classifier

In adversarial training, the goal of the classifier is to identify the health status of samples as accurately as possible while ensuring that sample features remain domain-invariant. Considering that there are differences in the transferability of samples from different source domains, source-specific weights are integrated into the NWMDAN classifier. The weighted classification loss \({{L}_{c}}\) of the weighted classifier can be formulated as follows:
$$\begin{aligned} {L_c} = \frac{1}{K}\sum \limits _{k = 1}^K {\frac{{{\omega _{s,k}}}}{{{n_{s,k}}}}} \sum \limits _{(x_i^{s,k},y_i^{s,k}) \in D_k^s} {{L_c}} (C(F(x_i^{s,k})),y_i^{s,k}). \end{aligned}$$
(8)
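Eq. 8 can be sketched in the same style. The example assumes the classifier outputs class probabilities and that labels are integer indices; the function name and toy inputs are illustrative only.

```python
import math

def weighted_classification_loss(source_batches, weights):
    """Weighted classification loss over K source domains (Eq. 8).

    source_batches[k] is a list of (class_probs, true_label) pairs for the
    k-th source domain; weights[k] is the matching source-specific weight.
    """
    num_sources = len(source_batches)
    total = 0.0
    for weight, batch in zip(weights, source_batches):
        mean_ce = sum(-math.log(probs[label]) for probs, label in batch) / len(batch)
        total += weight * mean_ce
    return total / num_sources

# Hypothetical four-class predictions: source 0 is classified well, source 1 poorly.
good = [([0.97, 0.01, 0.01, 0.01], 0), ([0.01, 0.97, 0.01, 0.01], 1)]
poor = [([0.25, 0.25, 0.25, 0.25], 2), ([0.25, 0.25, 0.25, 0.25], 3)]
loss_favor_good = weighted_classification_loss([good, poor], [0.8, 0.2])
loss_favor_poor = weighted_classification_loss([good, poor], [0.2, 0.8])
```

Shifting weight toward the poorly classified source raises the loss, so the weights control how strongly each source drives the classifier update.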

3.5 Optimization Objective

The optimization objective contains intra-domain alignment loss \({{L}_{\text {intra}}}\), domain adversarial loss \({{L}_{d}}\) and classification loss \({{L}_{c}}\), which is expressed as Eq. 9:
$$\begin{aligned} L = {L_c} + \lambda {L_{\mathrm{{intra}}}} - \beta {L_d} \end{aligned}$$
(9)
where \(\lambda \) and \(\beta \) are trade-off parameters that balance the loss terms. Minimizing \({{L}_{c}}\) allows samples of different fault categories to be predicted accurately, which helps the feature extractor learn class-discriminative features. Minimizing \({{L}_{\text {intra}}}\) reduces the distribution discrepancy in each pair of source and target domains. In addition, maximizing \({{L}_{d}}\) with respect to the feature extractor aligns the multiple source domains with the target domain. Let \({{\theta }_{f}}\), \({{\theta }_{c}}\), \({{\theta }_{d}}\) denote the parameters of the feature extractor, classifier and domain discriminator, respectively; they are updated as follows:
$$\begin{aligned} {{\theta }_{f}}&\leftarrow {{\theta }_{f}}-\alpha \left( \frac{\partial {{L}_{c}}}{\partial {{\theta }_{f}}}+\lambda \frac{\partial {{L}_{\text {intra}}}}{\partial {{\theta }_{f}}}-\beta \frac{\partial {{L}_{d}}}{\partial {{\theta }_{f}}}\right) \\ {{\theta }_{c}}&\leftarrow {{\theta }_{c}}-\alpha \frac{\partial {{L}_{c}}}{\partial {{\theta }_{c}}} \\ {{\theta }_{d}}&\leftarrow {{\theta }_{d}}-\alpha \frac{\partial {{L}_{d}}}{\partial {{\theta }_{d}}} \end{aligned}$$
(10)
where \(\alpha \) is the learning rate. With the GRL, the three updates in Eq. 10 are performed simultaneously during training, which avoids a staged training process.

4 Case Study

4.1 Dataset Description

(1) Jiangnan University (JNU) bearing dataset: The JNU dataset is a well-known open-source bearing dataset that is widely used as a benchmark to verify the feasibility of diagnostic methods. It covers four health states: normal state (Normal), inner race fault (IRF), outer race fault (ORF) and ball fault (BF). The vibration signals are collected at 50 kHz at three speeds (600, 800 and 1000 r/min).
(2) Self-made bearing test rig dataset-1 (Case1): The Case1 bearing vibration signals are collected from the experimental rig shown in Fig. 3(a). The experimental system is composed of an inverter motor, a main shaft, a supporting device, and the SKF 6206-2RS1/C3 test bearings. Three sensors at different locations are used to collect vibration signals in multiple directions. Like the JNU dataset, the Case1 dataset includes four state types (Normal, ORF, IRF and BF), and each fault type has different fault sizes. Figure 3(b)–(d) show the different health conditions of the tested bearings. In addition, four speeds (150, 300, 900 and 1500 r/min) are applied to the bearings by adjusting the driving device.
(3) Self-made bearing test rig dataset-2 (Case2): The Case2 bearing data are collected on the self-made experimental rig shown in Fig. 4, where, besides the ORF, IRF, and BF failure types, bearing retainer fracture failures (RF) are also simulated. The vibration signals under three loads (0 N, 500 N and 1000 N) at a speed of 300 r/min are selected as experimental data to further verify the effectiveness of the proposed method. The detailed descriptions of the datasets are given in Table 1.
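Table 1
Detailed description of the three datasets
| Dataset | Condition | Fault size (mm) | Label | Working condition |
| --- | --- | --- | --- | --- |
| JNU | Normal | | 0 | 600/800/1000 r/min |
| | IRF | | 1 | |
| | ORF | | 2 | |
| | BF | | 3 | |
| Case1 | Normal | | 0 | 150/300/900/1500 r/min |
| | BF | 0.4 | 1 | |
| | BF | 0.2 | 2 | |
| | IRF | 0.3 | 3 | |
| | IRF | 0.4 | 4 | |
| | ORF | 0.2 | 5 | |
| | ORF | 0.3 | 6 | |
| Case2 | Normal | | 0 | 0/500/1000 N |
| | RF | | 1 | |
| | BF | | 2 | |
| | IRF | 0.3 | 3 | |
| | IRF | 0.5 | 4 | |
| | ORF | 0.2 | 5 | |
| | ORF | 0.7 | 6 | |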

4.2 Comparison Methods

Several commonly used DA methods are chosen as comparison methods to validate the effectiveness and performance of the proposed network comprehensively. Specifically, the selected comparison methods have the same network structure and parameters as our proposed method. According to different states of the source domain data implemented in the comparison methods, the comparison experiments can be divided into the following three groups.
(1) Single source domain-based transfer methods (Single best): CNN, deep adaptation network (DAN) [41], deep CORAL (DCORAL) [42], joint adaptation network (JAN) [43], and DANN, each trained with a single source domain. For this group, each source domain is selected in turn to construct a transfer task, and the maximum diagnostic accuracy obtained across the source domains is recorded as the final result. The CNN model, whose feature extractor and classifier have the same structure as in the proposed method, serves as the baseline. DAN and DCORAL, the most widely used statistics matching-based DA methods, calculate the distribution discrepancy by MK-MMD and CORAL, respectively. A JAN based on the joint maximum mean discrepancy is also included. Finally, the adversarial learning-based DANN, which is the original framework of our approach, is chosen as a comparison method.
(2) Merged source domain-based transfer methods (Merged source domains): CNN, DAN, JAN, DCORAL, and DANN with merged source domains. Compared with single-source methods, this group enriches the source domain data: merging fault samples collected under different working conditions into one source domain makes it possible to extract more diagnostic information.
(3) Multi-source domain transfer methods (Multisource domain): NWMDAN_a, NWMDAN_b, and the MDA network (MDAN) [44] with multiple source domains. As one of the most effective MDA approaches, MDAN is employed as a comparison method. NWMDAN_a and NWMDAN_b, variants of NWMDAN, serve as comparison methods to validate the effectiveness of the proposed intra-domain distribution alignment strategy and non-uniformly weighted adversarial training framework. In NWMDAN_a, the intra-domain distribution alignment strategy is removed, while in NWMDAN_b, the non-uniform weights are omitted. Otherwise, both NWMDAN_a and NWMDAN_b retain the same structure and parameters as NWMDAN.
The utilized CNN model comprises a feature extractor and a classifier, whose structures are consistent with the corresponding components in the proposed method. Specifically, DAN, JAN, and DCORAL augment the CNN architecture by incorporating MK-MMD, joint MMD, and CORAL distance metrics respectively to address distribution discrepancies. In the DANN model, the feature extractor and classifier align with the proposed method, while the discriminator functions as a binary classifier. Extending the DANN, the MDAN model enhances the binary discriminator to enable the recognition of multiple domains. The network architectures of NWMDAN_a and NWMDAN_b align with the proposed method, with the distinction that the intra-domain distribution alignment strategy and non-uniform weighting scheme have been respectively omitted.

4.3 Experimental Details

The vibration data recorded at different rotational speeds in the JNU and Case1 datasets, and under different loads in the Case2 dataset, are regarded as different domains, as shown in Table 2. Each domain of the JNU dataset contains 2930 samples, of which 1466 are from healthy bearings and the remainder are from the various fault states. For the Case1 dataset, 1134 samples are available in each domain, divided equally among seven categories. The Case2 dataset contains 195 samples for each fault state. By default, each sample in the three datasets contains 1024 sampling points. For each task, 80\(\%\) of the samples from the source and target domains are randomly selected for training, and the remaining 20\(\%\) are used for testing. The details of all transfer tasks on the three datasets are shown in Table 3.
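The random 80/20 split can be sketched as follows in plain Python. The fixed seed, the rounding rule, and the function name are assumptions; the text does not say whether the split is stratified by class, so this sketch splits each domain as a whole.

```python
import random

def split_domain(samples, train_fraction=0.8, seed=0):
    """Randomly split one domain's samples into training and test subsets."""
    rng = random.Random(seed)              # assumed fixed seed for repeatability
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test

# A Case1 domain has 1134 samples; sample IDs stand in for the signals here.
train_ids, test_ids = split_domain(list(range(1134)))
```

With 1134 samples this yields 907 training and 227 test samples.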
Table 2
Detailed description of all diagnostic tasks
| Dataset | Working condition | Domain name |
| --- | --- | --- |
| JNU | Rotating speed: 600 r/min | A |
| | Rotating speed: 800 r/min | B |
| | Rotating speed: 1000 r/min | C |
| Case1 | Rotating speed: 150 r/min | D |
| | Rotating speed: 300 r/min | E |
| | Rotating speed: 900 r/min | F |
| | Rotating speed: 1500 r/min | G |
| Case2 | Load: 0 N | H |
| | Load: 500 N | I |
| | Load: 1000 N | J |
Table 3
Detailed description of all diagnostic tasks
| Dataset | Transfer task | Source domain | Target domain |
| --- | --- | --- | --- |
| JNU | A+B\(\rightarrow \)C | A+B | C |
| | A+C\(\rightarrow \)B | A+C | B |
| | B+C\(\rightarrow \)A | B+C | A |
| Case1 | D+E+F\(\rightarrow \)G | D+E+F | G |
| | D+E+G\(\rightarrow \)F | D+E+G | F |
| | D+F+G\(\rightarrow \)E | D+F+G | E |
| | E+F+G\(\rightarrow \)D | E+F+G | D |
| Case2 | H+I\(\rightarrow \)J | H+I | J |
| | H+J\(\rightarrow \)I | H+J | I |
| | I+J\(\rightarrow \)H | I+J | H |
The trade-off parameters \(\lambda \) and \(\beta \) of NWMDAN, which correspond to the domain alignment loss and the discrimination loss in the optimization objective, respectively, may have a significant impact on performance. Therefore, a grid search is employed to investigate the sensitivity of the network performance to \(\lambda \) and \(\beta \), and the experimental results for task B+C\(\rightarrow \)A are illustrated in Fig. 5. According to Fig. 5, the highest diagnostic accuracy is achieved at \(\lambda =0.8\) and \(\beta =0.6\), and the performance of NWMDAN degrades significantly when both parameter values are close to 1.
Furthermore, the sensitivity of the diagnostic performance to the hardness coefficient \(\eta \) is investigated on three tasks, and the results are shown in Fig. 6. The proposed method achieves the highest diagnostic accuracy when \(\eta =20\). When the value of \(\eta \) is close to 100, the accuracy decreases on all diagnostic tasks, and the degradation is more pronounced in tasks where knowledge transfer is more challenging, for example B+C\(\rightarrow \)A. The detailed settings of the remaining parameters are given in Table 4. In particular, a fixed-step update strategy is used to adjust the learning rate, which decreases to 1e-4 and 1e-5 after 150 and 250 epochs, respectively.
Table 4
Detailed parameter settings for the experiments
| Parameters | Value | Parameters | Value |
| --- | --- | --- | --- |
| Maximum epoch | 300 | Weight decay | 1e-5 |
| Batch size | 64 | Trade-off parameters \(\lambda ,\beta \) | 0.8, 0.6 |
| Optimizer | Adam | Learning rate | 1e-3 |
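The fixed-step learning rate strategy described above can be sketched as a small lookup in plain Python. The zero-based epoch index and the names `SCHEDULE` and `learning_rate` are assumptions made for illustration.

```python
# (first epoch index at which the rate applies, learning rate); epochs are zero-based.
SCHEDULE = ((0, 1e-3), (150, 1e-4), (250, 1e-5))

def learning_rate(epoch):
    """Return the learning rate in effect at the given zero-based epoch index."""
    rate = SCHEDULE[0][1]
    for start_epoch, value in SCHEDULE:
        if epoch >= start_epoch:
            rate = value
    return rate

# Rates at the boundaries of the 300-epoch run.
rates = [learning_rate(e) for e in (0, 149, 150, 249, 250, 299)]
```

In PyTorch, a comparable schedule is commonly expressed with `torch.optim.lr_scheduler.MultiStepLR` using milestones at 150 and 250 and a decay factor of 0.1, although the text does not state which mechanism was used.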
Finally, the accuracy on the test samples of the target domain is chosen as the model evaluation metric. Each experiment is repeated five times to reduce the effect of randomness, and the average of the five runs is taken as the final diagnostic accuracy. All experiments are implemented in the PyTorch framework on a workstation with an Intel Core i5-11400K CPU and an NVIDIA GeForce RTX 3060 GPU.

4.4 Experimental Results and Analysis on JNU Case

4.4.1 Comparison of Classification Accuracy

The proposed method and the comparison methods discussed in the previous section are applied to the tasks in the JNU case, and the detailed diagnostic results are presented in Table 5. NWMDAN achieves the highest diagnostic accuracy in all diagnostic tasks, with an average diagnostic accuracy of 99.16\(\%\), indicating its effectiveness in multisource cross-domain diagnostic tasks. Further analysis leads to the following conclusions.
Table 5
Classification accuracy (%) on the JNU case with different methods
| Groups | Methods | A+B\(\rightarrow \)C | A+C\(\rightarrow \)B | B+C\(\rightarrow \)A | Average |
| --- | --- | --- | --- | --- | --- |
| Single best | CNN | 94.92 | 94.67 | 79.71 | 89.77 |
| | DAN | 96.55 | 96.79 | 95.29 | 96.21 |
| | JAN | 97.71 | 97.39 | 96.76 | 97.29 |
| | DCORAL | 97.30 | 95.37 | 89.74 | 94.14 |
| | DANN | 97.18 | 96.42 | 91.06 | 94.89 |
| Merged source domains | CNN | 95.33 | 96.43 | 83.96 | 91.91 |
| | DAN | 96.79 | 97.65 | 94.95 | 96.46 |
| | JAN | 97.54 | 98.09 | 96.45 | 97.36 |
| | DCORAL | 96.17 | 96.79 | 89.86 | 94.27 |
| | DANN | 97.88 | 97.27 | 93.33 | 96.16 |
| Multisource domain | MDAN | 96.31 | 97.58 | 94.83 | 96.24 |
| | NWMDAN_a | 97.47 | 98.26 | 96.11 | 97.28 |
| | NWMDAN_b | 97.75 | 98.15 | 97.00 | 97.63 |
| Proposed method | NWMDAN | 99.18 | 99.73 | 98.57 | 99.16 |
(1) Compared with the single-source transfer methods, the merged source domain and multisource methods achieve higher diagnostic accuracy in most tasks, but negative transfer is observed in some tasks, for example, task A+B\(\rightarrow \)C. These results demonstrate that multiple source domains contain more diagnostic information than a single source domain, but extracting it is a complicated task.

(2) Although the merged source domain approaches improve diagnostic accuracy over the single-source approaches, their performance remains lower than that of the proposed approach. This indicates that the methods based on merged source domains do not sufficiently extract multisource invariant features. For the proposed NWMDAN, the average diagnostic accuracy remains above 99\(\%\) and no task falls below 98.5\(\%\), further demonstrating that the proposed method achieves efficient and accurate knowledge transfer.

(3) The proposed NWMDAN outperforms MDAN in diagnostic accuracy on all multisource tasks. The results also validate the advantages and necessity of the proposed alignment strategy and non-uniform weighting scheme. Although NWMDAN_a achieves more promising diagnostic results than MDAN, it is less competitive than the proposed method. Similarly, when the weighting scheme is removed, the diagnostic accuracy of NWMDAN_b decreases on every task compared with that of the proposed method.
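The comparisons above can be checked directly from the per-task accuracies in Table 5. The short sketch below recomputes the Average column for a subset of rows and the margin of NWMDAN over the strongest other method; the dictionary keys and variable names are illustrative, and only the accuracy values come from the table.

```python
# Per-task accuracies (%) copied from Table 5, in the order
# A+B->C, A+C->B, B+C->A.
table5 = {
    "JAN (single best)": (97.71, 97.39, 96.76),
    "JAN (merged)": (97.54, 98.09, 96.45),
    "MDAN": (96.31, 97.58, 94.83),
    "NWMDAN_a": (97.47, 98.26, 96.11),
    "NWMDAN_b": (97.75, 98.15, 97.00),
    "NWMDAN": (99.18, 99.73, 98.57),
}

# Average column: arithmetic mean over the three transfer tasks.
averages = {name: round(sum(accs) / len(accs), 2) for name, accs in table5.items()}

# Strongest method other than the proposed one, and the resulting margin.
best_other = max((n for n in averages if n != "NWMDAN"), key=averages.get)
margin = round(averages["NWMDAN"] - averages[best_other], 2)

# Lowest per-task accuracy of the proposed method.
worst_task_accuracy = min(table5["NWMDAN"])

print(averages)
print(best_other, margin, worst_task_accuracy)
```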
 

4.4.2 Feature Visualization Results

To analyze the model performance and effectiveness more intuitively, the high-dimensional feature representations are visualized using t-distributed stochastic neighbor embedding (t-SNE). The feature visualization results of task B+C\(\rightarrow \)A are given in Fig. 7. As shown in Fig. 7(a), the CNN without a transfer learning strategy exhibits serious class confusion and unclear classification boundaries, which is consistent with the poor diagnostic results of the CNN in Table 5. Poor clustering performance and significant overlaps between classes are found in Fig. 7(b). This indicates that it is difficult to eliminate multi-domain shifts by adversarial training alone. In Fig. 7(c), although promising clustering performance is achieved, the samples of different categories are not well separated, which leads to a decline in classification accuracy. By contrast, separable boundaries between samples of different health statuses and well-fused features are clearly observed in Fig. 7(d), which indicates the excellent properties of the proposed method in eliminating multi-domain shifts.

4.4.3 Weight Analysis

To analyze the effectiveness of the weighting scheme, the weight values assigned to the source domains in the different diagnostic tasks are recorded in Fig. 8, where \({{w}_{1}}\) and \({{w}_{2}}\) denote the weight values of the first and second source domains, respectively.
As can be seen from Fig. 8, the weights estimated from the MMD values are closely related to the physical working conditions of the source domains: source domains whose working conditions are closer to those of the target domain are assigned larger weights. The results in Figs. 8 and 9 show a similar trend between the weights and the single-source transfer results. When a source domain achieves higher diagnostic accuracy in single-source transfer, it is often given a higher weight in the MDA scenario. In the tasks A+B\(\rightarrow \)C and B+C\(\rightarrow \)A, where the speeds of A, B, and C are 600, 800, and 1000 r/min, respectively, source domain B, whose speed is closest to that of the target domain, is given the largest weight. In the diagnostic task A+C\(\rightarrow \)B, where the gaps in working conditions between each source domain and the target domain are equal, the lower-speed source domain A is selected as the best source. This suggests that, within the JNU dataset, source domain A holds diagnostic information closely related to the target working condition.
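The idea that a smaller MMD between a source and the target yields a larger weight can be illustrated with a small sketch. It assumes a Gaussian kernel, a biased squared-MMD estimate, and a softmax over negative MMD values as the normalization; these are illustrative choices, not necessarily the weighting formula used by NWMDAN, and the feature vectors are synthetic toy data.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    k_xx = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xs) / len(xs) ** 2
    k_yy = sum(gaussian_kernel(a, b, sigma) for a in ys for b in ys) / len(ys) ** 2
    k_xy = sum(gaussian_kernel(a, b, sigma) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


def source_weights(sources, target, sigma=1.0):
    """Return (weights, mmds); weights sum to 1 and grow as the MMD shrinks."""
    mmds = [mmd_squared(src, target, sigma) for src in sources]
    # Assumed normalization: softmax over negative MMD values.
    scores = [math.exp(-m) for m in mmds]
    total = sum(scores)
    return [s / total for s in scores], mmds


# Synthetic 2-D features: source_near resembles the target, source_far does not.
target = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2)]
source_near = [(0.1, 0.0), (0.3, 0.2), (0.0, 0.1)]
source_far = [(2.0, 2.0), (2.2, 1.9), (1.8, 2.1)]

weights, mmds = source_weights([source_near, source_far], target)
print("MMD^2:", mmds)
print("weights (w1, w2):", weights)
```

In this toy setting the source closer to the target receives the larger weight, mirroring the trend described for Fig. 8.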

4.5 Experimental Results and Analysis on Case1

4.5.1 Comparison of Classification Accuracy

Experiments on Case1, which keeps the same experimental setup as the JNU case, are conducted to further validate the effectiveness of NWMDAN. The detailed experimental results are recorded in Table 6. NWMDAN achieves more competitive diagnostic performance than the comparison methods. More concretely, an average diagnostic accuracy of 99.41\(\%\) is observed for the proposed method over all tasks. Meanwhile, negative transfer is not observed in the diagnosis results of NWMDAN. The proposed network also retains superior diagnostic performance compared with NWMDAN_a and NWMDAN_b, further validating the effectiveness of the intra-domain distribution alignment strategy and the weighting scheme.
Table 6
Classification accuracy (%) on the Case1 with different methods, by transfer task

| Groups | Methods | D+E+F\(\rightarrow \)G | D+E+G\(\rightarrow \)F | D+F+G\(\rightarrow \)E | E+F+G\(\rightarrow \)D | Average |
|---|---|---|---|---|---|---|
| Single best | CNN | 96.67 | 81.37 | 90.63 | 87.50 | 89.04 |
| | DAN | 98.94 | 97.09 | 95.95 | 92.16 | 96.04 |
| | JAN | 99.21 | 97.95 | 97.18 | 91.19 | 96.38 |
| | DCORAL | 98.68 | 94.36 | 96.04 | 90.40 | 94.87 |
| | DANN | 98.86 | 93.83 | 96.48 | 92.51 | 95.42 |
| Merged source domains | CNN | 96.48 | 94.80 | 91.37 | 86.61 | 92.32 |
| | DAN | 98.87 | 96.92 | 96.21 | 91.81 | 95.95 |
| | JAN | 98.86 | 97.36 | 96.04 | 94.45 | 96.68 |
| | DCORAL | 98.42 | 96.74 | 94.10 | 91.72 | 95.25 |
| | DANN | 98.77 | 97.18 | 96.39 | 94.36 | 96.68 |
| Multisource domain | MDAN | 98.15 | 96.74 | 95.59 | 91.63 | 95.53 |
| | NWMDAN_a | 99.30 | 98.06 | 98.15 | 94.27 | 97.45 |
| | NWMDAN_b | 99.47 | 98.42 | 97.80 | 96.48 | 98.04 |
| Proposed method | NWMDAN | 99.91 | 99.47 | 99.74 | 98.98 | 99.41 |
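The observation that NWMDAN shows no negative transfer on Case1 can be checked against the table. The sketch below assumes a simple working definition: a multisource method exhibits negative transfer on a task if its accuracy falls below the strongest single-source result for that task. The function and variable names are illustrative; the accuracy values are copied from Table 6.

```python
TASKS = ("D+E+F->G", "D+E+G->F", "D+F+G->E", "E+F+G->D")

# "Single best" rows of Table 6 (accuracy in %), one tuple per method.
single_best = {
    "CNN": (96.67, 81.37, 90.63, 87.50),
    "DAN": (98.94, 97.09, 95.95, 92.16),
    "JAN": (99.21, 97.95, 97.18, 91.19),
    "DCORAL": (98.68, 94.36, 96.04, 90.40),
    "DANN": (98.86, 93.83, 96.48, 92.51),
}

# Multisource rows of Table 6.
multisource = {
    "MDAN": (98.15, 96.74, 95.59, 91.63),
    "NWMDAN_a": (99.30, 98.06, 98.15, 94.27),
    "NWMDAN_b": (99.47, 98.42, 97.80, 96.48),
    "NWMDAN": (99.91, 99.47, 99.74, 98.98),
}

# Strongest single-source accuracy per task, across all single-best methods.
best_single = [max(column) for column in zip(*single_best.values())]


def negative_transfer_tasks(accuracies):
    """List the tasks where a method falls below the best single-source result."""
    return [task for task, acc, ref in zip(TASKS, accuracies, best_single) if acc < ref]


report = {name: negative_transfer_tasks(accs) for name, accs in multisource.items()}
for name, tasks in report.items():
    print(name, "->", tasks if tasks else "no negative transfer")
```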

4.5.2 Feature Visualization Results

As in the JNU case, the visualization results for the transfer task E+F+G\(\rightarrow \)D are illustrated in Fig. 10. Promising clustering performance can be observed in the visualization result of NWMDAN shown in Fig. 10(d), where fault samples of different states are clearly separated. By contrast, Fig. 10(a) shows significant overlap among the classes and poor alignment between the source and target domains. Figure 10(b) and (c) show clear changes in the distribution of features across categories, but a slight overlap between categories remains, which leads to a decrease in diagnosis performance.

4.5.3 Weight Analysis

Unlike the JNU case, each diagnosis task in Case1 contains three source domains. In Fig. 11, the weight curves of the source domains on the four transfer tasks are recorded, where \({{w}_{1}}\), \({{w}_{2}}\), and \({{w}_{3}}\) represent the weights of the three source domains. Combined with the single-source transfer diagnosis results shown in Fig. 12, it can be observed that the assignment of weights is closely related to the working condition differences between the source and target domains. The F, G, D, and E source domains are given the largest weights in the four diagnostic tasks, respectively, which is consistent with the ranking of the single-source diagnostic results.

4.6 Experimental Results and Analysis on Case2

4.6.1 Comparison of Classification Accuracy

The diagnostic accuracies of the proposed NWMDAN and the comparison methods on the Case2 dataset are shown in Table 7. NWMDAN achieves an average diagnostic accuracy of 99.76% on the three tasks, outperforming the comparison methods. On the task H+J\(\rightarrow \)I, NWMDAN achieves 100% classification accuracy, while the single best and merged source domain approaches yield lower classification accuracy. On average, NWMDAN_a and NWMDAN_b outperform MDAN, further validating the effectiveness of the proposed intra-domain distribution alignment strategy and non-uniformly weighted domain adversarial framework. Compared with the JNU and Case1 datasets, the load variations in the Case2 dataset do not significantly increase the knowledge transfer challenge, and both NWMDAN and the comparison methods achieve high classification accuracy on all three tasks.
Table 7
Classification accuracy (%) on the Case2 with different methods, by transfer task

| Groups | Methods | H+I\(\rightarrow \)J | H+J\(\rightarrow \)I | I+J\(\rightarrow \)H | Average |
|---|---|---|---|---|---|
| Single best | CNN | 96.83 | 96.46 | 96.41 | 96.56 |
| | DAN | 98.05 | 97.92 | 97.80 | 97.92 |
| | JAN | 98.08 | 97.74 | 96.94 | 97.59 |
| | DCORAL | 98.17 | 97.80 | 97.75 | 97.87 |
| | DANN | 96.68 | 97.67 | 96.58 | 96.98 |
| Merged source domains | CNN | 96.82 | 96.79 | 95.97 | 96.53 |
| | DAN | 97.44 | 99.51 | 96.70 | 97.88 |
| | JAN | 98.05 | 98.41 | 98.53 | 98.33 |
| | DCORAL | 97.93 | 98.29 | 96.94 | 97.72 |
| | DANN | 97.56 | 98.66 | 97.07 | 97.76 |
| Multisource domain | MDAN | 98.35 | 98.77 | 97.90 | 98.34 |
| | NWMDAN_a | 97.66 | 99.02 | 98.40 | 98.36 |
| | NWMDAN_b | 98.97 | 99.27 | 99.02 | 99.09 |
| Proposed method | NWMDAN | 99.88 | 100.00 | 99.39 | 99.76 |

4.6.2 Feature Visualization Results

The results of the t-SNE visualizations on the Case2 dataset are presented in Fig. 13. The visualization of NWMDAN in Fig. 13(d) exhibits clear classification boundaries between the different classes, and the sample features in the target domain are accurately matched to the corresponding source domain features. In contrast, the visualizations in Fig. 13(a) and (b) exhibit poor distinguishability between the classes, and Fig. 13(c) presents partial confusion of the classes.

4.6.3 Weight Analysis

The non-uniform weights recorded during training on the transfer tasks of the Case2 dataset are shown in Fig. 14, and the diagnostic accuracy of the single-source methods is shown in Fig. 15. Unlike the JNU and Case1 datasets, the Case2 dataset involves variations in load. The results in Fig. 14 are similar to those for JNU and Case1, showing that the weights of the different sources are related to their physical working conditions. Sources whose working conditions are highly similar to those of the target domain tend to be assigned higher weights. Conversely, the contributions of source domains that differ significantly from the target working conditions are reduced.

5 Conclusion

In this study, a multisource domain IFD method, NWMDAN, is proposed that considers the non-uniform weights of the source domains and the domain shift between each pair of source and target domains. The proposed NWMDAN can precisely align multiple source domains with the target domain and effectively extract multi-domain invariant features. In particular, NWMDAN eliminates the domain shifts between each pair of source and target domains through the intra-domain distribution alignment strategy proposed in this study. Considering the different contributions of different source domains to the target domain, a non-uniform weighting scheme is designed to measure the relative importance of the source domains. Moreover, to further learn and combine multisource diagnostic information, a non-uniformly weighted domain adversarial framework is designed. The experimental results on three datasets demonstrate the effectiveness of the proposed NWMDAN. Comprehensive results show that the proposed network obtains a promising DA effect and more competitive diagnostic performance than the comparison methods.
Although the proposed method is expected to achieve good MDA performance, it is limited by the assumption that the label spaces of the source and target domains are equivalent. Future research will focus on a more complex multisource cross-domain diagnostic scenario with asymmetric labeling spaces of the source and target domains.

Acknowledgements

This work was supported by the Natural Science Foundation of Anhui Province, China [Grant Number 2108085MG236]; the Natural Science Foundation of Anhui Province, China [Grant Number 2208085MG181]; and the Natural Science Foundation from the Education Bureau of Anhui Province, China [Grant Number KJ2021A0385].

Declarations

Conflicts of interest

The author(s) declared no potential conflicts of interest concerning the research, authorship, and/or publication of this article.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References
20. Ganin Y, Ustinova E, Ajakan H, Germain P, Larochelle H, Laviolette F, Marchand M, Lempitsky V (2016) Domain-adversarial training of neural networks. The journal of machine learning research 17(1):2096–2030
29. Blitzer J, Crammer K, Kulesza A, Pereira F, Wortman J (2007) Learning bounds for domain adaptation. Adv Neural Inf Process Syst 20:129–136
41. Long M, Cao Y, Wang J, Jordan M. Learning transferable features with deep adaptation networks. In: International conference on machine learning, pp 97–105. PMLR
43. Long M, Zhu H, Wang J, Jordan MI (2017) Deep transfer learning with joint adaptation networks. In: 34th International conference on machine learning. Proc Mach Learn Res 70:2208–2217
Metadata
Title: Non-Uniformly Weighted Multisource Domain Adaptation Network For Fault Diagnosis Under Varying Working Conditions
Authors: Hongliang Zhang, Yuteng Zhang, Rui Wang, Haiyang Pan, Bin Chen
Publication date: 1 April 2024
Publisher: Springer US
Published in: Neural Processing Letters, Issue 2/2024
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-024-11568-2
