
Wasserstein Distance based Deep Adversarial Transfer Learning for Intelligent Fault Diagnosis (1903.06753v1)

Published 2 Mar 2019 in cs.LG, eess.SP, and stat.ML

Abstract: The demand for adopting artificial intelligence in condition-based maintenance strategies has increased markedly over the past few years. Intelligent fault diagnosis is one critical topic of maintenance solutions for mechanical systems. Deep learning models, such as convolutional neural networks (CNNs), have been successfully applied to fault diagnosis tasks for mechanical systems and achieved promising results. However, for the diverse working conditions found in industry, deep learning suffers from two difficulties: first, the well-defined (source domain) and new (target domain) datasets have different feature distributions; second, insufficient or no labelled data in the target domain significantly reduces fault diagnosis accuracy. Deep transfer learning (DTL) addresses these problems by performing learning in the target domain while leveraging information from a relevant source domain. Inspired by the Wasserstein distance of optimal transport, this paper proposes a novel DTL approach to intelligent fault diagnosis, namely Wasserstein Distance based Deep Transfer Learning (WD-DTL), which learns domain feature representations (generated by a CNN-based feature extractor) and minimizes the discrepancy between the source and target distributions through adversarial training. The effectiveness of the proposed WD-DTL is verified through 3 transfer scenarios and 16 transfer fault diagnosis experiments covering both unsupervised and supervised (with insufficient labelled data) learning. A comprehensive network visualization analysis of these transfer tasks is also provided.

Citations (207)

Summary

  • The paper presents WD-DTL, a novel framework that reduces domain discrepancies using Wasserstein distance for efficient fault diagnosis.
  • It integrates CNN-based feature extraction with a domain critic and adversarial training to align source and target feature distributions.
  • Experiments demonstrate average accuracy improvements of over 13% for speed-transfer tasks and nearly 25% for location-transfer tasks, confirming its robustness with limited labeled data.

A Review and Analysis of Wasserstein Distance based Deep Adversarial Transfer Learning for Intelligent Fault Diagnosis

The paper introduces a novel approach, Wasserstein Distance based Deep Transfer Learning (WD-DTL), aimed at enhancing intelligent fault diagnosis in mechanical systems. The primary challenge it addresses is the variability of feature distributions across different working conditions in industrial settings, which complicates fault diagnosis when relying solely on deep learning models such as CNNs. Specifically, two major difficulties are identified: the domain shift between source and target datasets and the scarcity of labeled data in the target domain.

Problem Background and Contribution

Traditional deep learning models struggle with domain shifts where the source and target datasets have different feature distributions. This challenge results in a need for deep models to be retrained from scratch when faced with new diagnosis tasks, leading to inefficient use of resources. The lack of labeled data in target domains further compounds this issue. Deep transfer learning (DTL) is identified as a potential solution to these problems, with the ability to leverage relevant information from labeled source domains to improve the learning and performance in the target domain.

The paper introduces the WD-DTL framework, which incorporates the use of Wasserstein distance to resolve domain discrepancies. This distance metric, rooted in optimal transport theory, offers a compelling alternative due to its advantageous gradient properties compared to methods such as Maximum Mean Discrepancy (MMD). The paper highlights that WD-DTL integrates adversarial training to minimize domain feature discrepancies using CNN-derived features, thus facilitating better domain adaptation.
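The gradient advantage can be illustrated with a toy computation. The sketch below is an illustration of the general idea, not the paper's implementation: for 1-D empirical distributions with equal sample counts, the Wasserstein-1 distance has a closed form (mean absolute difference of sorted samples), and it grows linearly with a shift between two sample sets even after they stop overlapping, whereas a Gaussian-kernel statistic such as MMD saturates there and yields vanishing gradients.

```python
import numpy as np

def wasserstein_1d(xs, ys):
    # For 1-D empirical distributions with equal sample counts, W1 has a
    # closed form: optimal transport pairs order statistics, so the
    # distance is the mean absolute difference of the sorted samples.
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))
    return float(np.mean(np.abs(xs - ys)))

# As the target samples are shifted further from the source samples,
# W1 keeps growing linearly with the shift, so its gradient w.r.t. the
# shift stays informative even for fully disjoint distributions.
src = np.zeros(100)
for shift in (1.0, 2.0, 4.0):
    print(shift, wasserstein_1d(src, src + shift))  # prints 1.0, 2.0, 4.0
```

This linear growth is exactly the "advantageous gradient" property the paper exploits: an adversarially trained critic estimating this distance gives the feature extractor a useful training signal even when source and target features are initially far apart.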

Methodology

The paper delineates the architecture of WD-DTL, comprising three key components: a CNN-based feature extractor, a domain critic, and a discriminator. The CNN feature extractor is pre-trained on source domain data to learn initial feature representations effectively. The domain critic utilizes Wasserstein distance to reduce the distribution gap between source and target domains by adversarially training the feature extractor. Finally, a discriminator further refines the learned representations for improved classification accuracy.
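The adversarial interplay between the extractor and the domain critic can be sketched in miniature. The toy below is a heavily simplified, hypothetical stand-in for WD-DTL, not the paper's network: scalar features replace the CNN representation, a single learnable shift `b` stands in for the adapted feature extractor, and a clipped, decayed linear critic weight `v` stands in for the Lipschitz-constrained domain critic. The critic ascends the dual Wasserstein objective while the "extractor" descends it, pulling the target feature distribution onto the source one.

```python
import numpy as np

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, 500)   # source-domain "features"
tgt = rng.normal(3.0, 1.0, 500)   # target-domain "features" (domain shift)

v, b = 0.5, 0.0   # v: linear critic weight, b: learnable target shift
for _ in range(300):
    # Critic ascent on the dual objective v * (E[src] - E[tgt + b]);
    # decay plus clipping crudely keeps the critic ~1-Lipschitz.
    gap = src.mean() - (tgt + b).mean()
    v = np.clip(0.9 * v + 0.1 * gap, -1.0, 1.0)
    # "Extractor" descent: the gradient of v * gap w.r.t. b is -v, so a
    # descent step shifts the target features toward the source ones.
    b += 0.1 * v

print(abs(src.mean() - (tgt + b).mean()) < 0.1)  # prints True: aligned
```

In the full method this minimax game is played between a CNN feature extractor and a multi-layer critic over high-dimensional features, with the discriminator's classification loss trained jointly so that the aligned features remain discriminative for fault types.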

Experimental Scenarios

WD-DTL is evaluated on 16 transfer tasks across three scenarios: unsupervised transfer between different motor speeds, unsupervised transfer between different sensor locations, and supervised transfer with limited labeled data across sensor locations. This setup probes both the method's unsupervised domain adaptation ability and its effectiveness when only limited labeled target data are available.

Results and Implications

The paper presents comprehensive results demonstrating WD-DTL's superior performance over a baseline CNN and competing transfer learning approaches such as the Deep Adaptation Network (DAN). Specifically, WD-DTL achieves average accuracy improvements exceeding 13% for speed transfers and close to 25% for location transfers. The authors report that WD-DTL leads in both unsupervised and supervised settings, demonstrating the versatility of the Wasserstein distance in adapting to domain shifts in intelligent fault diagnosis.

Furthermore, visualization through t-SNE embeddings supports the efficacy of WD-DTL, showcasing well-separated clusters for different fault types post-adaptation. The method's robustness to variations and its efficacy with minimal labeled data are also highlighted as significant strengths.

Future Directions and Conclusion

The implications of this work are substantial for AI applications in industrial maintenance, where domain adaptation is frequently necessary due to ever-changing operational conditions. While the current results are promising, future work could incorporate signal processing techniques to further refine feature extraction, particularly under noisy conditions. Additionally, testing WD-DTL in broader industrial scenarios beyond motor-speed and sensor-location transfers may pave the way for more comprehensive intelligent diagnostic solutions.

In conclusion, the WD-DTL framework offers a significant advancement in domain adaptation for fault diagnosis, utilizing the robust and theoretically grounded Wasserstein distance to bridge source-target domain gaps efficiently. This work sets a noteworthy benchmark for future transfer learning applications in industrial AI systems.