TranDRL: A Transformer-Driven Deep Reinforcement Learning Enabled Prescriptive Maintenance Framework (2309.16935v3)
Abstract: Industrial systems demand reliable predictive maintenance strategies to enhance operational efficiency and reduce downtime. This paper introduces an integrated framework that leverages Transformer-based neural networks and deep reinforcement learning (DRL) algorithms to optimize system maintenance actions. Our approach employs a Transformer model to capture complex temporal patterns in sensor data and thereby accurately predict the remaining useful life (RUL) of equipment. The DRL component of the framework then provides cost-effective and timely maintenance recommendations. We validate the framework on the NASA C-MAPSS dataset, where it achieves significant improvements in both RUL prediction accuracy and maintenance-action optimization compared to other prevalent machine learning methods. The proposed approach offers an innovative, data-driven framework for industrial machine systems that accurately forecasts equipment lifespans and optimizes maintenance schedules, thereby reducing downtime and cutting costs.
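As a rough illustration of the pipeline the abstract describes, the sketch below shows a minimal Transformer encoder that maps a window of multivariate sensor readings to a scalar RUL estimate, whose output could then feed the state of a DRL maintenance policy. This is a hedged sketch, not the authors' implementation: the sensor count, window length, layer sizes, and the maintenance action set are all assumptions for illustration.

```python
# Hypothetical sketch (not the paper's code): Transformer-based RUL regression
# from sensor windows. All dimensions below are assumed, not taken from the paper.
import torch
import torch.nn as nn

class RULTransformer(nn.Module):
    def __init__(self, n_sensors=24, d_model=64, n_heads=4, n_layers=2, window=30):
        super().__init__()
        self.embed = nn.Linear(n_sensors, d_model)              # project sensors to model width
        self.pos = nn.Parameter(torch.zeros(window, d_model))   # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)                       # scalar RUL regression head

    def forward(self, x):                                        # x: (batch, window, n_sensors)
        h = self.encoder(self.embed(x) + self.pos)
        return self.head(h.mean(dim=1)).squeeze(-1)              # pool over time -> RUL estimate

# The predicted RUL would form part of the state seen by a DRL agent that picks
# among maintenance actions (e.g., do nothing, minor repair, replace) to trade
# off downtime cost against failure risk -- the action set here is illustrative.
model = RULTransformer()
rul = model(torch.randn(8, 30, 24))  # dummy batch of sensor windows
print(rul.shape)                      # torch.Size([8])
```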
- Yang Zhao
- Wenbo Wang
- Helin Yang
- Dusit Niyato
- Jiaxi Yang