Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning (2405.02425v1)

Published 3 May 2024 in cs.RO and cs.AI

Abstract: We apply multi-agent deep reinforcement learning (RL) to train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This setting reflects many challenges of real-world robotics, including active perception, agile full-body control, and long-horizon planning in a dynamic, partially-observable, multi-agent domain. We rely on large-scale, simulation-based data generation to obtain complex behaviors from egocentric vision which can be successfully transferred to physical robots using low-cost sensors. To achieve adequate visual realism, our simulation combines rigid-body physics with learned, realistic rendering via multiple Neural Radiance Fields (NeRFs). We combine teacher-based multi-agent RL and cross-experiment data reuse to enable the discovery of sophisticated soccer strategies. We analyze active-perception behaviors including object tracking and ball seeking that emerge when simply optimizing perception-agnostic soccer play. The agents display equivalent levels of performance and agility as policies with access to privileged, ground-truth state. To our knowledge, this paper constitutes a first demonstration of end-to-end training for multi-agent robot soccer, mapping raw pixel observations to joint-level actions, that can be deployed in the real world. Videos of the game-play and analyses can be seen on our website https://sites.google.com/view/vision-soccer .

Authors (16)

Dhruva Tirumala (15 papers)
Markus Wulfmeier (46 papers)
Ben Moran (9 papers)
Sandy Huang (5 papers)
Jan Humplik (15 papers)
Guy Lever (18 papers)
Tuomas Haarnoja (16 papers)
Leonard Hasenclever (33 papers)
Arunkumar Byravan (27 papers)
Nathan Batchelor (5 papers)
Neil Sreendra (2 papers)
Kushal Patel (9 papers)
Marlon Gwira (2 papers)
Francesco Nori (51 papers)
Martin Riedmiller (64 papers)
Nicolas Heess (139 papers)

Summary

End-to-End Reinforcement Learning for Onboard Vision-Based Robot Soccer

Introduction

Departing from traditional robot soccer approaches, the researchers developed a training pipeline that uses multi-agent reinforcement learning (RL) to teach robots to play soccer with onboard sensing only. The robots perceive the field through a head-mounted camera, supplemented by internal sensors such as an IMU and joint encoders.

The Core of the Training Method

The method centers on egocentric RGB vision: policies map raw camera pixels to joint-level actions, with all sensing and computation onboard. The major components are:

Simulation and Neural Radiance Fields (NeRFs): The simulation pairs rigid-body physics with realistic rendering learned via multiple NeRFs. Hundreds of photos of the scene are captured to build a realistic, static 3D model that can be rendered from any viewpoint. Dynamic objects such as the soccer ball and opponent robots are added to this static scene during simulation.
Reinforcement Learning Pipeline: Separate 'expert' agents are first trained on specific tasks, such as getting up from the ground or scoring against a stationary opponent. These experts are then distilled into a single, more general agent that can handle the full complexity of a soccer match.
Memory and Active Vision: Rather than relying only on the current view, agents use memory (via LSTM layers) to remember and anticipate the positions of dynamic objects such as the ball and opponents. This includes anticipating an object's motion and future position while it is temporarily out of sight.
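
The distillation step in the pipeline above can be illustrated with a small sketch. It assumes, purely for illustration, that teacher and student policies output diagonal Gaussian distributions over joint actions and that the student's RL loss is regularized by a KL term toward the teacher. The function names, the Gaussian parameterization, and the `kl_weight` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def gaussian_kl(mu_t, sigma_t, mu_s, sigma_s):
    """KL(teacher || student) for diagonal Gaussians, summed over action dims.

    All inputs have shape (batch, action_dim). Hypothetical helper.
    """
    var_t = sigma_t ** 2
    var_s = sigma_s ** 2
    per_dim = (
        np.log(sigma_s / sigma_t)
        + (var_t + (mu_t - mu_s) ** 2) / (2.0 * var_s)
        - 0.5
    )
    return per_dim.sum(axis=-1)

def student_loss(rl_loss, mu_t, sigma_t, mu_s, sigma_s, kl_weight):
    """RL loss plus a weighted penalty for deviating from the teacher.

    kl_weight is an assumed hyperparameter; annealing it toward zero would
    let the student eventually depart from the teacher's behavior.
    """
    kl = gaussian_kl(mu_t, sigma_t, mu_s, sigma_s).mean()
    return rl_loss + kl_weight * float(kl)

# Example: a batch of 4 observations, 3 action dimensions.
teacher_mu = np.zeros((4, 3))
student_mu = np.full((4, 3), 0.2)
sigma = np.ones((4, 3))
loss = student_loss(1.0, teacher_mu, sigma, student_mu, sigma, kl_weight=0.1)
```

When teacher and student agree exactly the KL term is zero, so the penalty only shapes behavior where the student drifts from the expert.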

How Do the Robots Perform?

The robots demonstrate impressive soccer skills:

Agility and Strategy Execution: They walk, turn, and kick with agility and performance comparable to agents trained with access to privileged, ground-truth game state.
Emergent Behavior: Without specific instructions to do so, robots start displaying behaviors like actively seeking the ball, positioning themselves strategically against opponents, and even blocking shots, purely as a function of the overarching goal to play soccer well.
Object Tracking and Active Vision: A striking finding is how well these robots can track the ball and gauge their relative position on the field using their onboard camera, despite the limitations imposed by a low-resolution input.
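
One simple way to quantify the tracking behavior described above is the fraction of timesteps in which the ball lies inside the camera's horizontal field of view. The sketch below is illustrative only: the function names, the 60-degree field of view, and the assumption that the camera points along the robot's heading are hypothetical choices rather than details from the paper, and occlusion and vertical field of view are ignored.

```python
import math

def ball_in_view(robot_xy, robot_yaw, ball_xy, fov_deg=60.0):
    """Return True if the ball's bearing is within the horizontal FOV.

    robot_yaw is the heading in radians; the camera is assumed to look
    straight along that heading (an assumption for this sketch).
    """
    dx = ball_xy[0] - robot_xy[0]
    dy = ball_xy[1] - robot_xy[1]
    bearing = math.atan2(dy, dx) - robot_yaw
    # Wrap the relative bearing into [-pi, pi].
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))
    return abs(bearing) <= math.radians(fov_deg) / 2.0

def in_view_fraction(trajectory, fov_deg=60.0):
    """Fraction of (robot_xy, robot_yaw, ball_xy) steps with the ball in view."""
    if not trajectory:
        raise ValueError("trajectory must contain at least one step")
    hits = sum(ball_in_view(r, yaw, b, fov_deg) for r, yaw, b in trajectory)
    return hits / len(trajectory)

# Example: ball ahead of the robot in the first step, behind it in the second.
steps = [
    ((0.0, 0.0), 0.0, (1.0, 0.0)),
    ((0.0, 0.0), 0.0, (-1.0, 0.0)),
]
fraction = in_view_fraction(steps)
```

Comparing this fraction between a vision-based agent and a baseline is one way such a metric could be used to check whether ball-seeking behavior has emerged.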

What Does This Mean for the Future of Robotics?

The implications of this research go beyond soccer. It points toward broader robotics applications in which external sensors are too expensive or impractical to rely on. For instance, search and rescue robots or autonomous delivery drones operating in complex, dynamically changing environments could benefit significantly from this approach.

Looking Ahead: Challenges and Opportunities

Despite the success, translating simulated training into real-world scenarios remains challenging. Differences such as lighting conditions, unexpected physical interactions (like a robot bumping into another), and hardware limitations (like camera quality or processing power) can reduce performance or require additional tweaks.

Conclusion

Using end-to-end reinforcement learning to enable robots to play soccer with only onboard vision is both challenging and fascinating. As research progresses, more realistic simulation techniques and more advanced machine learning models promise more robust and versatile robotic capabilities for real-world applications.
