
Effective control of two-dimensional Rayleigh–Bénard convection: invariant multi-agent reinforcement learning is all you need (2304.02370v2)

Published 5 Apr 2023 in physics.flu-dyn and cs.LG

Abstract: Rayleigh-Bénard convection (RBC) is a recurrent phenomenon in several industrial and geoscience flows and a well-studied system from a fundamental fluid-mechanics viewpoint. However, controlling RBC, for example by modulating the spatial distribution of the bottom-plate heating in the canonical RBC configuration, remains a challenging topic for classical control-theory methods. In the present work, we apply deep reinforcement learning (DRL) for controlling RBC. We show that effective RBC control can be obtained by leveraging invariant multi-agent reinforcement learning (MARL), which takes advantage of the locality and translational invariance inherent to RBC flows inside wide channels. The MARL framework applied to RBC allows for an increase in the number of control segments without encountering the curse of dimensionality that would result from a naive increase in the DRL action-size dimension. This is made possible by the MARL ability for re-using the knowledge generated in different parts of the RBC domain. We show in a case study that MARL DRL is able to discover an advanced control strategy that destabilizes the spontaneous RBC double-cell pattern, changes the topology of RBC by coalescing adjacent convection cells, and actively controls the resulting coalesced cell to bring it to a new stable configuration. This modified flow configuration results in reduced convective heat transfer, which is beneficial in several industrial processes. Therefore, our work both shows the potential of MARL DRL for controlling large RBC systems, as well as demonstrates the possibility for DRL to discover strategies that move the RBC configuration between different topological configurations, yielding desirable heat-transfer characteristics. These results are useful for both gaining further understanding of the intrinsic properties of RBC, as well as for developing industrial applications.


Summary

  • The paper introduces invariant MARL to overcome the curse of dimensionality in controlling two-dimensional Rayleigh–Bénard convection.
  • It segments the convection domain into pseudo-environments to localize control actions, effectively reducing the Nusselt number.
  • Comparative results show MARL outperforms single-agent methods, offering scalable strategies for complex fluid dynamics.

Effective Control of Two-Dimensional Rayleigh–Bénard Convection with Invariant Multi-Agent Reinforcement Learning

The paper applies deep reinforcement learning (DRL) to the control of two-dimensional Rayleigh–Bénard convection (RBC), a canonical problem in fluid dynamics. RBC, characterized by complex convective heat-transfer processes, is central to several industrial and geoscientific settings. The authors leverage invariant multi-agent reinforcement learning (MARL) to discover efficient control strategies for RBC, in particular in wide channels containing multiple convection cells, thereby addressing challenges such as the curse of dimensionality in the control space.
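For context, two standard nondimensional quantities frame the problem (these are textbook definitions and may differ in detail from the paper's specific nondimensionalization): the Rayleigh number measures the strength of the thermal forcing, and the Nusselt number quantifies the heat transfer that the controller seeks to reduce.

```latex
\mathrm{Ra} = \frac{g\,\beta\,\Delta T\,H^{3}}{\nu\,\kappa},
\qquad
\mathrm{Nu} = \frac{\langle q_{z}\rangle\,H}{k\,\Delta T}
```

Here g is the gravitational acceleration, β the thermal expansion coefficient, ΔT the imposed temperature difference between the plates, H the layer height, ν the kinematic viscosity, κ the thermal diffusivity, ⟨q_z⟩ the mean vertical heat flux, and k the thermal conductivity. Nu = 1 corresponds to purely conductive transport, so driving Nu toward 1 amounts to suppressing convective heat transfer.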

MARL Approach and Main Findings

The research adopts the MARL paradigm to manage RBC in domains with large aspect ratios and periodic boundary conditions. This contrasts with conventional single-agent reinforcement learning (SARL), whose joint action space grows with the number of simultaneously controlled actuators and therefore suffers from the curse of dimensionality.
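The contrast can be sketched in a few lines of Python (a minimal illustration with assumed names and shapes, not the paper's implementation): a single agent must emit one joint action covering every heating segment, whereas invariant MARL evaluates one shared policy once per segment on a local observation, so the per-agent action size stays fixed as segments are added.

```python
import numpy as np

N_SEGMENTS = 10  # number of independently heated bottom-plate segments (illustrative)

def sarl_action(policy, global_obs):
    # Single-agent RL: one network maps the full observation to all segment
    # heating values at once, so the action dimension scales with N_SEGMENTS.
    return policy(global_obs)  # expected shape: (N_SEGMENTS,)

def marl_actions(shared_policy, local_observations):
    # Invariant MARL: the same policy is applied to each local observation,
    # so each agent outputs a single value regardless of how many segments exist.
    return np.array([shared_policy(obs) for obs in local_observations])
```

Because the shared policy only ever sees a local window, experience gathered anywhere along the translationally invariant domain improves the behaviour of every agent.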

  • MARL efficacy: By exploiting the inherent translational invariance of RBC, the MARL framework achieves effective control through segmentation of the domain into pseudo-environments. This segmentation enables localized control actions that collectively steer the flow without burdening the learning process with unnecessary complexity (see the sketch after this list).
  • Control-strategy discovery: The MARL approach discovers an intricate control strategy that destabilizes the spontaneous RBC double-cell pattern, prompts coalescence of adjacent convection cells, and then actively drives the merged cell to a new stable configuration with reduced convective heat transfer. The resulting reduction in the Nusselt number demonstrates the effectiveness of the control and is directly relevant to industrial processes in which minimizing heat transfer is advantageous.
  • Comparison with SARL: Within the same training budget, SARL fails to learn a comparably effective control policy because it cannot handle the enlarged joint action space efficiently. This underscores MARL's potential in complex fluid-control scenarios, where multi-agent frameworks substantially accelerate learning and strategy optimization.
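The following Python sketch illustrates how such a pseudo-environment decomposition and a Nusselt-based reward could be organized. All names, shapes, and the Nusselt expression are illustrative assumptions; the paper itself couples proximal-policy-optimization agents to a spectral RBC solver rather than to these toy routines.

```python
import numpy as np

def split_into_pseudo_envs(field, n_agents):
    """Split a 2-D field of shape (nz, nx) into n_agents local windows along the
    periodic x-direction. Translational invariance makes the windows statistically
    equivalent, which is what allows one shared policy to be reused by every agent."""
    width = field.shape[1] // n_agents
    return [field[:, i * width:(i + 1) * width] for i in range(n_agents)]

def nusselt_number(u_z, temperature, kappa):
    """Illustrative volume-averaged Nusselt number for a unit-height,
    unit-temperature-drop nondimensionalization: Nu = 1 + <u_z T> / kappa.
    The exact expression depends on the solver's scaling."""
    return 1.0 + np.mean(u_z * temperature) / kappa

def marl_control_step(shared_policy, u_z, temperature, n_agents, kappa, nu_baseline):
    """One control step: each agent observes only its own window, all agents share the
    same policy weights, and the common reward grows as the global Nusselt number drops."""
    local_obs = split_into_pseudo_envs(temperature, n_agents)
    actions = [shared_policy(obs) for obs in local_obs]  # per-segment bottom-plate heating
    reward = nu_baseline - nusselt_number(u_z, temperature, kappa)
    return actions, reward
```

Sharing both the policy weights and the reward across agents keeps the learning problem roughly the size of a single-segment problem even as the number of actuated segments grows, which is the mechanism behind the scalability reported in the paper.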

Practical and Theoretical Implications

The paper's findings carry implications on several fronts:

  • Industrial Applications: The ability to control and optimize heat-transfer properties of RBC has direct applications in improving thermal management systems. Industries reliant on thermal regulation can leverage the paper's outcomes to enhance energy efficiency.
  • Future Research Directions: The success of MARL in this simplified RBC model sets the stage for future exploration in more complex and three-dimensional configurations. This could eventually lead to more effective control mechanisms in turbulent flows, not just for RBC but across various fluid dynamic systems.
  • Theoretical Contributions: On a theoretical level, the paper validates MARL's potential to overcome the curse of dimensionality in fluid control problems. It exemplifies how localized control through quasi-independent pseudo-environments can be synthesized into a holistic control mechanism.

Conclusion

The paper demonstrates a substantive advancement in fluid-control methodologies based on neural-network frameworks, tailored to complex, high-dimensional systems such as RBC. It highlights how leveraging system symmetries and invariances with multi-agent approaches such as MARL can yield efficient and scalable control solutions. This examination of RBC control via DRL and MARL not only enriches the fluid-mechanics discipline but also sets a template for future studies aiming to harness artificial intelligence for dynamic-system optimization. The work further represents a step toward bridging machine-learning techniques with traditional fluid dynamics, reinforcing interdisciplinary synergies aimed at solving both classical and contemporary engineering challenges.
