Reinforcement Twinning: from digital twins to model-based reinforcement learning (2311.03628v4)

Published 7 Nov 2023 in eess.SY and cs.SY

Abstract: Digital twins promise to revolutionize engineering by offering new avenues for optimization, control, and predictive maintenance. We propose a novel framework for simultaneously training the digital twin of an engineering system and an associated control agent. The twin's training combines adjoint-based data assimilation and system identification methods, while the control agent's training merges model-based optimal control with model-free reinforcement learning. The control agent evolves along two independent paths: one driven by model-based optimal control and the other by reinforcement learning. The digital twin serves as a virtual environment for confrontation and indirect interaction, functioning as an "expert demonstrator." The best policy is selected for real-world interaction and cloned to the other path if training stagnates. We call this framework Reinforcement Twinning (RT). The framework is tested on three diverse engineering systems and control tasks: (1) controlling a wind turbine under varying wind speeds, (2) trajectory control of flapping-wing micro air vehicles (FWMAVs) facing wind gusts, and (3) mitigating thermal loads in managing cryogenic storage tanks. These test cases use simplified models with known ground truth closure laws. Results show that the adjoint-based digital twin training is highly sample-efficient, completing within a few iterations. For the control agent training, both model-based and model-free approaches benefit from their complementary learning experiences. The promising results pave the way for implementing the RT framework on real systems.
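The select-and-clone step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Path` container, the `stagnated` test with its `patience` and `tol` parameters, and the quadratic `twin_cost` used in the usage lines are all hypothetical stand-ins, chosen only to show two policies being scored on the digital twin, the better one being selected, and its weights being copied to the other path once that path stops improving.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Path:
    """One training path (hypothetical container): a name, policy weights, cost log."""
    name: str
    weights: List[float]
    cost_history: List[float] = field(default_factory=list)


def stagnated(history: List[float], patience: int = 3, tol: float = 1e-3) -> bool:
    """Assumed stagnation test: the best cost improved by less than `tol`
    over the last `patience` evaluations."""
    if len(history) <= patience:
        return False
    best_before = min(history[:-patience])
    best_recent = min(history[-patience:])
    return best_before - best_recent < tol


def select_and_clone(path_a: Path, path_b: Path,
                     twin_cost: Callable[[List[float]], float],
                     patience: int = 3, tol: float = 1e-3) -> Path:
    """Score both policies on the twin, return the better path, and copy its
    weights to the other path if that path has stagnated."""
    for p in (path_a, path_b):
        p.cost_history.append(twin_cost(p.weights))
    if path_a.cost_history[-1] <= path_b.cost_history[-1]:
        best, other = path_a, path_b
    else:
        best, other = path_b, path_a
    if stagnated(other.cost_history, patience, tol):
        other.weights = list(best.weights)  # clone the better policy
    return best


# Usage with a toy twin: cost is the squared distance of the weights from 1.0.
def twin_cost(w: List[float]) -> float:
    return sum((x - 1.0) ** 2 for x in w)


mb = Path("model_based", [0.9])
mf = Path("model_free", [3.0])
for _ in range(4):  # neither policy is updated here, so model_free stagnates
    best = select_and_clone(mb, mf, twin_cost)
```

In a real loop each path would update its own weights between calls (by adjoint-based optimal control on one side and by reinforcement learning on the other), so cloning would occur only when one of them stalls.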
