- The paper provides a comprehensive benchmark analysis comparing four deep learning models on nine datasets to identify performance variations and best practices.
- The paper demonstrates that frequency domain inputs and proper data augmentation significantly boost diagnostic accuracy while mitigating overfitting.
- The paper releases an open-source code library to enhance reproducibility and guide future research in addressing class imbalance, interpretability, and domain adaptability.
Evaluation of Deep Learning Algorithms for Rotating Machinery Diagnosis
The paper, "Deep Learning Algorithms for Rotating Machinery Intelligent Diagnosis: An Open Source Benchmark Study," authored by Zhibin Zhao et al., addresses significant challenges in deep learning (DL) applications for rotating machinery diagnosis. The absence of standardized datasets, inconsistent hyper-parameter tuning, and limited open-source code repositories result in disparate evaluation outcomes and hinder advancement in the field. This paper provides a comprehensive benchmark analysis of DL models using publicly available datasets and standardized evaluation protocols, contributing to fairer and more effective comparisons for future research.
Methodological Approach
The authors investigate four DL models: multi-layer perceptron (MLP), auto-encoder (AE), convolutional neural network (CNN), and recurrent neural network (RNN). They evaluate these models across nine datasets, focusing primarily on seven due to labeling limitations in the others. The investigation highlights several preprocessing techniques, including input normalization and data augmentation, and examines how different data split strategies affect model performance.
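To make the preprocessing step concrete, the sketch below shows one common way such benchmarks turn a long raw vibration recording into fixed-length training samples, where overlapping windows double as a simple augmentation. This is a minimal illustration with a synthetic signal, not the authors' exact pipeline; the function name and window parameters are hypothetical.

```python
import numpy as np

def make_samples(signal, length=1024, stride=1024):
    """Slice a long 1-D vibration signal into fixed-length samples.

    Overlapping windows (stride < length) act as a simple form of
    data augmentation by multiplying the number of training samples.
    """
    n = (len(signal) - length) // stride + 1
    return np.stack([signal[i * stride : i * stride + length] for i in range(n)])

# Hypothetical raw signal standing in for one recording from a public dataset.
rng = np.random.default_rng(0)
signal = rng.standard_normal(10240)

samples = make_samples(signal, length=1024, stride=512)  # 50% overlap
print(samples.shape)  # (19, 1024)
```

A larger overlap (smaller stride) yields more samples, which matters for the smaller datasets where overfitting is a concern.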
Key Findings
- Dataset Influence:
- The models achieved over 95% accuracy on every dataset except UoC, highlighting significant variance in dataset difficulty.
- Datasets were ranked based on diagnostic difficulty, revealing insights into their suitability for benchmarking diagnostic models.
- Input Format Impact:
- Frequency domain inputs consistently resulted in higher accuracy than time domain and other transformed inputs, indicating the importance of feature richness achievable through frequency analysis.
- Model Performance:
- CNN models often surpassed AE models in accuracy, particularly on complex datasets. However, AE models performed better on certain datasets such as MFPT and UoC, suggesting that CNNs may overfit when training data is scarce.
- Data Augmentation and Normalization:
- Augmentation strategies generally improved model robustness in datasets with lower baseline accuracy.
- Z-score normalization appeared to provide a slight edge in model performance across different datasets and models.
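The two input-side findings above (frequency-domain inputs and z-score normalization) can be sketched together. The snippet below converts a time-domain sample to a one-sided FFT magnitude spectrum and then z-score normalizes it; the test signal and function names are illustrative, not taken from the paper's code library.

```python
import numpy as np

def to_frequency_domain(x):
    """One-sided FFT magnitude spectrum of a time-domain sample."""
    return np.abs(np.fft.rfft(x))

def zscore(x):
    """Z-score normalization: zero mean, unit variance per sample."""
    return (x - x.mean()) / (x.std() + 1e-8)

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 1024, endpoint=False)
# Hypothetical vibration sample: a 60 Hz tone (e.g., a fault frequency)
# buried in broadband noise.
x = np.sin(2 * np.pi * 60 * t) + 0.1 * rng.standard_normal(1024)

spectrum = zscore(to_frequency_domain(x))
print(spectrum.shape)            # (513,) — rfft of a 1024-point sample
print(int(np.argmax(spectrum)))  # 60 — the tone dominates the spectrum
```

The fault frequency that is hard to see in the raw waveform becomes a single dominant bin in the spectrum, which is one intuition for why frequency-domain inputs consistently helped in the benchmark.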
Practical Implications and Future Directions
The authors release a comprehensive code library to facilitate further comparative studies within the community, enhancing reproducibility and collaborative development. The benchmark provides a lower-bound accuracy standard that can guide the evaluation of emerging models. Moreover, this paper identifies crucial issues requiring further research: class imbalance, generalization capabilities, interpretability, few-shot learning, and efficient model selection.
Generalization and Transfer Learning
The paper highlights inadequate generalization across varying operational conditions, underscoring a vital need for robust transfer learning methodologies. This could involve domain adaptation techniques or leveraging large-scale meta-learning strategies to improve adaptability.
Class Imbalance
Imbalanced datasets, prevalent in industrial diagnostics, lead to skewed model training and performance evaluation. Addressing this requires novel strategies, potentially involving synthetic data generation or cost-sensitive learning.
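As a minimal illustration of the cost-sensitive direction mentioned above, one standard recipe is to weight each class inversely to its frequency so that rare fault classes contribute more to the training loss. The function and label set below are hypothetical, not from the paper.

```python
import numpy as np

def inverse_frequency_weights(labels, num_classes):
    """Cost-sensitive learning sketch: weight each class inversely to
    its frequency so minority fault classes contribute more to the loss."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts.sum() / (num_classes * counts)

# Hypothetical imbalanced label set: 90 healthy samples, 10 faulty.
labels = np.array([0] * 90 + [1] * 10)
w = inverse_frequency_weights(labels, num_classes=2)
print(w)  # [0.5556 5.    ] — the minority fault class is weighted 9x higher
```

These per-class weights can then be passed to a weighted cross-entropy loss; synthetic oversampling of the minority class is the complementary data-level alternative.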
Interpretability and Transparency
Despite achieving high diagnostic accuracy, DL models often lack interpretability, posing risks in critical applications. This calls for future work focusing on explainability techniques tailored to DL diagnostics, ensuring that model decisions are transparent and justifiable.
Few-Shot and Efficient Learning
Assembling large annotated datasets is often infeasible. Few-shot learning paradigms, which leverage minimal data to generate actionable insights, represent a promising direction, possibly utilizing transfer learning or augmentation-enhanced strategies.
Conclusion
Zhao et al.'s research acts as a pivotal reference for upcoming advancements in DL-based machinery diagnostics. By offering a deeply structured evaluation framework and emphasizing open-source dissemination, it fosters transparency and innovation across the research community. Addressing the identified research gaps will be instrumental in cementing DL's role in reliable, intelligent industrial diagnostics.