Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning

Published Feb 8, 2024 in cs.RO and cs.LG


Recent advances in real-world applications of reinforcement learning (RL) have relied on the ability to accurately simulate systems at scale. However, domains such as fluid dynamical systems exhibit complex dynamic phenomena that are hard to simulate at high integration rates, limiting the direct application of modern deep RL algorithms to often expensive or safety-critical hardware. In this work, we introduce "Box o Flows", a novel benchtop experimental control system for systematically evaluating RL algorithms in dynamic real-world scenarios. We describe the key components of the Box o Flows, and through a series of experiments demonstrate how state-of-the-art model-free RL algorithms can synthesize a variety of complex behaviors via simple reward specifications. Furthermore, we explore the role of offline RL in data-efficient hypothesis testing by reusing past experiences. We believe that the insights gained from this preliminary study and the availability of systems like the Box o Flows support the way forward for developing systematic RL algorithms that can be generally applied to complex, dynamical systems. Supplementary material and videos of experiments are available at https://sites.google.com/view/box-o-flows/home.


  • The paper introduces 'Box o’ Flows,' a fluid-dynamic control system designed to test and refine Reinforcement Learning (RL) algorithms in real-world conditions.

  • Uses nine upward-facing nozzles to create a dynamically changing airflow environment in which table tennis balls are controlled.

  • Employs Maximum A-posteriori Policy Optimization (MPO), a model-free RL algorithm, to learn control policies through direct interaction and offline data analysis.

  • Demonstrates RL's potential in mastering complex control tasks in real-world fluid dynamics through experiments with the 'Box o’ Flows' system.

Introduction to "Box o’ Flows"

Reinforcement Learning (RL) has shown promise in controlling complex dynamical systems where traditional approaches falter. Most of that success, however, has come in simulation, and transferring it to physical systems is hard because some dynamics cannot be simulated with enough fidelity. This paper introduces "Box o’ Flows," an experimental fluid-dynamic control system for testing and refining RL algorithms under real-world conditions with dynamic, unpredictable fluid motion. The platform is meant to narrow the gap between RL results in simulation and practical applications by letting algorithms be developed and evaluated directly on hardware.

"Box o’ Flows" System Overview

The "Box o’ Flows" setup has nine upward-facing nozzles, each controlled by a proportional pneumatic valve that regulates its airflow. The resulting flow field is used to move rigid objects, such as table tennis balls, placed inside the box. The valves share a single air supply, which intentionally introduces cross-coupling between them and makes the control problem harder. A high-speed camera captures the motion of the objects, providing the data from which the RL algorithms learn. The combination of fluid flow and multiple interacting objects makes the system difficult to simulate accurately.
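
To make the cross-coupling from the shared air supply concrete, here is a minimal toy sketch. It is not the paper's model of the hardware: the function name `nozzle_flows`, the normalized valve openings in [0, 1], the `supply_capacity` value, and the proportional-sharing rule are all assumptions chosen for illustration.

```python
from typing import List

NUM_NOZZLES = 9  # from the text: nine upward-facing nozzles


def nozzle_flows(valve_commands: List[float], supply_capacity: float = 3.0) -> List[float]:
    """Toy model (assumption): split a limited shared air supply among the valves.

    Each command is a normalized valve opening in [0, 1]. If the total demand
    exceeds the supply capacity, every nozzle's flow is scaled down by the same
    factor, so opening one valve reduces the flow available at the others.
    """
    if len(valve_commands) != NUM_NOZZLES:
        raise ValueError(f"expected {NUM_NOZZLES} valve commands")
    if supply_capacity <= 0:
        raise ValueError("supply_capacity must be positive")

    # Clip commands to the valid opening range.
    openings = [min(max(c, 0.0), 1.0) for c in valve_commands]
    demand = sum(openings)

    # Enough supply for everyone: each nozzle gets what it asks for.
    if demand <= supply_capacity:
        return openings

    # Otherwise share the supply proportionally (hypothetical coupling rule).
    scale = supply_capacity / demand
    return [o * scale for o in openings]
```

Under these assumptions, three fully open valves each receive a flow of 1.0, while fully opening a fourth drops each of them to 0.75. Real pneumatic coupling is nonlinear and time-dependent, which is part of why the system is hard to simulate.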

Methodological Approach

The research uses Maximum A-posteriori Policy Optimization (MPO), a state-of-the-art, model-free, sample-efficient RL algorithm, to learn control policies. The agent faces a high-dimensional continuous control problem and must learn dynamic behaviors with minimal prior knowledge and without a simulator. Policies are learned both online, through direct interaction with the system, and offline, from previously logged data. The paper emphasizes offline RL as a way to reuse past experience and test new hypotheses without collecting fresh real-world data for each one.
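
As a rough illustration of how MPO improves a policy, its policy improvement step re-weights actions sampled from the current policy by their exponentiated Q-values, and the policy is then fit toward the re-weighted samples. The sketch below shows only that weighting for a single state. The function name `mpo_action_weights` is hypothetical, and the temperature is fixed here as a simplifying assumption, whereas MPO itself obtains it by solving a dual problem under a KL constraint.

```python
import math
from typing import List


def mpo_action_weights(q_values: List[float], temperature: float) -> List[float]:
    """Softmax weights over actions sampled from the current policy.

    q_values: estimated Q(s, a_i) for each sampled action a_i in one state.
    temperature: fixed positive scalar (assumption; MPO optimizes this value).
    Returns weights proportional to exp(Q / temperature), normalized to sum to 1.
    """
    if not q_values:
        raise ValueError("q_values must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    scaled = [q / temperature for q in q_values]
    # Subtract the maximum before exponentiating for numerical stability.
    max_scaled = max(scaled)
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

A lower temperature concentrates the weight on the highest-valued actions, while a higher one keeps the improved policy closer to the current one.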

Empirical Results

The team ran several experiments in which the RL agent learned dynamic behaviors, such as hovering, rearranging, and stacking objects, purely through interaction with the "Box o’ Flows" system. In each case the agent found effective strategies for controlling the objects' state in a highly dynamic and unpredictable environment. The results indicate that RL algorithms can learn complex fluid-dynamic control tasks directly on hardware, without relying on simulation.
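
The abstract notes that these behaviors come from simple reward specifications. The paper's actual reward functions are not given in this summary, so the following is only a hypothetical example of what a simple hovering reward could look like. The name `hover_reward`, the normalized height units, and the linear falloff are all assumptions.

```python
def hover_reward(ball_height: float, target_height: float, tolerance: float = 0.1) -> float:
    """Hypothetical hovering reward (not taken from the paper).

    Returns 1.0 when the tracked ball is exactly at the target height and
    falls off linearly to 0.0 once it is `tolerance` or more away.
    Heights are assumed to be normalized, e.g. from camera tracking.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    error = abs(ball_height - target_height)
    return max(0.0, 1.0 - error / tolerance)
```

A reward of this kind depends only on the observed object state, so it can be computed from camera tracking without any model of the airflow.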

Implications and Future Directions

This paper's findings underscore the practicality and versatility of RL in real-world applications, particularly in domains traditionally considered challenging due to their dynamic nature. The "Box o’ Flows" system provides a valuable testbed for advancing RL research, offering insights into algorithmic performance and system dynamics that are difficult to obtain through simulation alone. Future work may explore model-based RL approaches and more sophisticated offline RL techniques to further enhance understanding and improve data efficiency. This research opens avenues for developing more robust and versatile RL algorithms capable of navigating the complexities of real-world phenomena.

Concluding Thoughts

"Box o’ Flows" represents a significant step towards bridging the gap between theoretical algorithm development and practical application in dynamic systems. By presenting a novel platform for testing RL algorithms in an environment that mimics real-world complexity, this work lays the groundwork for future advancements in RL research and its application across various domains involving fluid dynamics and beyond.
