Aerial Gym Simulator: A Framework for Highly Parallelized Simulation of Aerial Robots (2503.01471v1)

Published 3 Mar 2025 in cs.RO

Abstract: This paper contributes the Aerial Gym Simulator, a highly parallelized, modular framework for simulation and rendering of arbitrary multirotor platforms based on NVIDIA Isaac Gym. Aerial Gym supports the simulation of under-, fully- and over-actuated multirotors offering parallelized geometric controllers, alongside a custom GPU-accelerated rendering framework for ray-casting capable of capturing depth, segmentation and vertex-level annotations from the environment. Multiple examples for key tasks, such as depth-based navigation through reinforcement learning are provided. The comprehensive set of tools developed within the framework makes it a powerful resource for research on learning for control, planning, and navigation using state information as well as exteroceptive sensor observations. Extensive simulation studies are conducted and successful sim2real transfer of trained policies is demonstrated. The Aerial Gym Simulator is open-sourced at: https://github.com/ntnu-arl/aerial_gym_simulator.

Summary

  • The paper introduces a GPU-accelerated, highly parallelized simulator for multirotor robots that integrates DRL frameworks with versatile control interfaces.
  • It details a modular architecture featuring a shared Global Tensor Dictionary and custom controllers for position, attitude, and body-wrench tasks.
  • Benchmarking against open-source simulators demonstrates improved physics and rendering performance, with successful real-world DRL policy transfers.

This paper introduces the Aerial Gym Simulator, a modular framework that uses NVIDIA Isaac Gym for the parallelized simulation of multirotor platforms. The simulator supports under-, fully-, and over-actuated multirotors, and provides parallelized geometric controllers alongside a custom GPU (Graphics Processing Unit)-accelerated rendering framework designed for ray-casting, capturing depth, segmentation, and vertex-level annotations from the environment.

Key features of the Aerial Gym Simulator include:

  • A parallelized simulation framework for simulating multirotor platforms.
  • A control suite with interfaces for different airframes, ranging from position setpoints to RPM control with simulated motor dynamics.
  • A GPU-based rendering framework for creating and updating parallel rendering environments that can be randomized and modified at runtime.
  • Integration with DRL (Deep Reinforcement Learning) frameworks, utilizing ready-to-use controllers and accelerated rendering.

The paper presents the architecture of the Aerial Gym Simulator, which includes managers for the rendering engine, robots, and environments, as well as interfaces for the physics engine and common DRL frameworks. All components share a common memory bank called the GTD (Global Tensor Dictionary) and perform in-place operations on tensors. The simulator obtains states and joint information from the Isaac Gym physics engine and updates the GTD at each physics step. Geometric controllers use robot states from the GTD to provide forces and torques to the robot's actuators. Task definitions are constructed conforming to the Gymnasium API and provide task-specific interpretations of the information stored in the GTD.
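
To make the shared-memory design concrete, the following is a minimal sketch of a GTD-style tensor store with in-place updates; the class and field names are illustrative, not the simulator's actual API.

```python
import torch

# Minimal sketch of a GTD-style shared tensor store with in-place updates;
# class and field names are illustrative, not the simulator's actual API.
class GlobalTensorDictionary(dict):
    """Common memory bank shared by all simulator components."""

    def update_in_place(self, key: str, values: torch.Tensor) -> None:
        # copy_() writes into the existing storage, so views held by
        # controllers, tasks, or sensors stay valid across physics steps.
        self[key].copy_(values)


num_envs = 4096
device = "cuda" if torch.cuda.is_available() else "cpu"
gtd = GlobalTensorDictionary(
    robot_position=torch.zeros(num_envs, 3, device=device),
    robot_orientation=torch.zeros(num_envs, 4, device=device),  # quaternions
    robot_body_rates=torch.zeros(num_envs, 3, device=device),
)

# A controller can hold a view once; in-place updates keep it current.
position_view = gtd["robot_position"]
gtd.update_in_place("robot_position", torch.randn(num_envs, 3, device=device))
assert position_view.data_ptr() == gtd["robot_position"].data_ptr()
```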

The simulator supports the simulation of various airframe configurations, including non-planar airframes and multi-linked systems. Users can specify the number of motors, joint parameters, and sensors for an embodiment defined with a Unified Robot Description Format (URDF) file. The simulator can handle complex meshes and convex decompositions for simulating collisions. Arbitrary $n$-motor configurations can be simulated using a robot configuration file, as illustrated below. The simulator supports airframes with both active and passive rotational joints, and each joint can be interfaced with a proportional-derivative (PD) controller provided by Isaac Gym or with a custom implementation.
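
As an illustration, the snippet below sketches what such an $n$-motor robot configuration might contain; every field name here is hypothetical and does not reflect the simulator's actual schema.

```python
# Hypothetical configuration for an arbitrary n-motor platform; all field
# names are illustrative, not the simulator's actual schema.
robot_config = {
    "urdf_path": "robots/hexarotor.urdf",       # embodiment definition
    "num_motors": 6,
    "motor_spin_directions": [1, -1, 1, -1, 1, -1],
    "motor_time_constant_s": 0.02,              # first-order motor dynamics
    "collision": {"use_convex_decomposition": True},
    "joints": [
        # Active joints can be driven by a PD controller...
        {"name": "arm_tilt_0", "type": "active", "kp": 5.0, "kd": 0.1},
        # ...while passive joints evolve freely under the physics engine.
        {"name": "payload_hinge", "type": "passive"},
    ],
    "sensors": ["depth_camera", "imu"],
}
```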

The simulator is equipped with controllers across levels of abstraction. Adapted geometric controllers are provided, though their performance can be suboptimal on non-planar configurations. The controllers employ PyTorch's JIT-compiled modules for faster execution on the GPU (a minimal batched sketch follows the list below). The following controllers and interfaces are provided:

  • Attitude-Thrust and Angular Rate-Thrust Controllers: The errors in desired orientation $\mathbf{e_R}$ and body rates $\mathbf{e}_{\boldsymbol{\Omega}}$, alongside the resulting desired body-torque $\mathbf{M}$, are computed as:

    $\mathbf{e_R} = \frac{1}{2} (\mathbf{R}_d^T \mathbf{R} - \mathbf{R}^T \mathbf{R}_d)^{\vee},$

    $\mathbf{e}_{\boldsymbol{\Omega}} = \boldsymbol{\Omega} - \mathbf{R}^T \mathbf{R}_d \boldsymbol{\Omega}_d,$

    $\mathbf{M} = -\mathbf{k_R} \mathbf{e_R} - \mathbf{k}_{\boldsymbol{\Omega}} \mathbf{e}_{\boldsymbol{\Omega}} + \boldsymbol{\Omega} \times \mathbf{J} \boldsymbol{\Omega}$

    where

    • $\mathbf{e_R}$ is the error in desired orientation
    • $\mathbf{e}_{\boldsymbol{\Omega}}$ is the error in body rates
    • $\mathbf{M}$ is the desired body-torque expressed in the body-fixed frame $\mathcal{B}$
    • $\mathbf{R}$ and $\mathbf{R}_d$ denote the current and desired orientation
    • $\boldsymbol{\Omega}$ and $\boldsymbol{\Omega}_d$ denote the current and desired angular rates of the robot, all expressed in $\mathcal{B}$
    • $\mathbf{k_R}$ and $\mathbf{k}_{\boldsymbol{\Omega}}$ are adequate weights
    • $\mathbf{J}$ is the robot moment of inertia
    • $(\cdot)^{\vee}$ is the vee-map
  • Position, Velocity and Acceleration Controllers: The desired body-torque $\mathbf{M}$ is calculated as above and the thrust command $\mathbf{f}$ is calculated as:

    $\mathbf{f} = (\mathbf{k_x} \mathbf{e_x} + \mathbf{k_v} \mathbf{e_v} + m g \mathbf{e}_3 - m \ddot{\mathbf{x}}_d) \cdot \mathbf{R}\mathbf{e}_3$

    where

    • $\mathbf{e_x}$ and $\mathbf{e_v}$ denote the position and velocity errors in the inertial frame $\mathcal{W}$
    • $\mathbf{k_x}$ and $\mathbf{k_v}$ denote the respective weights
    • $g$ is the magnitude of acceleration due to gravity
    • $m$ is the robot mass
    • $\ddot{\mathbf{x}}_d$ is the desired robot acceleration in $\mathcal{W}$
    • $\mathbf{e}_3$ is a unit vector in the z-direction
  • Body-wrench Controller: The desired or externally-commanded body-wrench is represented as $\mathbf{W} = [f_x, f_y, f_z, M_x, M_y, M_z]^T$. For an $n$-rotor system, the thrust command provided to the actuators is $\mathbf{U} = [u_1, u_2, \dots, u_n]^T$. The control-effectiveness matrix $\mathbf{B}$ relates $\mathbf{U}$ and $\mathbf{W}$ as:

    $\mathbf{W} = \mathbf{B}\mathbf{U},$

    $\mathbf{U}^\textrm{ref} = \mathbf{B}^+ \mathbf{W}^\textrm{ref}$

    where $\mathbf{B}^+$ denotes the Moore-Penrose pseudoinverse of $\mathbf{B}$.
  • RPM and Thrust Setpoint Interfaces: The simulator allows the motors to be commanded directly via either thrust or RPM setpoints.
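
The following is a minimal batched PyTorch sketch of the attitude-error and body-wrench-allocation equations above; function names, gain handling, and tensor shapes are illustrative rather than the simulator's JIT-compiled modules.

```python
import torch

def vee(S: torch.Tensor) -> torch.Tensor:
    """Vee-map: batch of skew-symmetric matrices (N, 3, 3) -> vectors (N, 3)."""
    return torch.stack((S[:, 2, 1], S[:, 0, 2], S[:, 1, 0]), dim=-1)

def attitude_controller(R, R_d, Omega, Omega_d, J, k_R, k_Omega):
    """Desired body-torque M for a batch of N robots.
    R, R_d: (N, 3, 3) rotation matrices; Omega, Omega_d: (N, 3) body rates;
    J: (3, 3) inertia; k_R, k_Omega: scalar (or per-axis) gains."""
    # e_R = 1/2 * vee(R_d^T R - R^T R_d)
    e_R = 0.5 * vee(R_d.transpose(1, 2) @ R - R.transpose(1, 2) @ R_d)
    # e_Omega = Omega - R^T R_d Omega_d
    e_Omega = Omega - (R.transpose(1, 2) @ R_d @ Omega_d.unsqueeze(-1)).squeeze(-1)
    # M = -k_R e_R - k_Omega e_Omega + Omega x J Omega
    J_Omega = (J @ Omega.unsqueeze(-1)).squeeze(-1)
    return -k_R * e_R - k_Omega * e_Omega + torch.linalg.cross(Omega, J_Omega)

def allocate_wrench(B: torch.Tensor, W_ref: torch.Tensor) -> torch.Tensor:
    """Per-motor thrusts U_ref = B^+ W_ref (B: (6, n), W_ref: (N, 6))."""
    return (torch.linalg.pinv(B) @ W_ref.unsqueeze(-1)).squeeze(-1)
```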

The Aerial Gym Simulator provides simulated sensors, including exteroceptive sensors and an IMU. The exteroceptive sensor implementation complements the default rendering engine of NVIDIA Isaac Gym and overcomes its limitations using NVIDIA Warp: standalone ray-casting kernels are developed and integrated with the simulator.
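
A minimal sketch of such a standalone depth ray-cast kernel is shown below, assuming Warp's `mesh_query_ray` built-in; the kernel and array names are illustrative, not the simulator's own implementation.

```python
import warp as wp

wp.init()

# Illustrative standalone depth ray-cast kernel in the spirit of the
# simulator's Warp-based rendering; names here are hypothetical.
@wp.kernel
def raycast_depth(mesh: wp.uint64,
                  ray_origins: wp.array(dtype=wp.vec3),
                  ray_dirs: wp.array(dtype=wp.vec3),
                  max_range: float,
                  depth: wp.array(dtype=float)):
    tid = wp.tid()
    q = wp.mesh_query_ray(mesh, ray_origins[tid], ray_dirs[tid], max_range)
    if q.result:
        depth[tid] = q.t          # hit distance along the ray
    else:
        depth[tid] = max_range    # no hit within range

# Usage sketch (assuming `env_mesh` is a wp.Mesh built from the environment
# geometry and the ray arrays live on the GPU):
# wp.launch(raycast_depth, dim=num_rays,
#           inputs=[env_mesh.id, origins, dirs, 10.0, depth_out])
```

The IMU is modeled with additive noise and random-walk bias terms. The measured acceleration $\mathbf{a}_{\textrm{meas}}$ and angular velocity $\boldsymbol{\Omega}_{\textrm{meas}}$ at the sensor are calculated as: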

$\mathbf{a}_{\textrm{meas}, t} = \mathbf{a}_{\textrm{true}, t} + \mathbf{b}_{\mathbf{a},t} + \mathbf{n_a},$

$\boldsymbol{\Omega}_{\textrm{meas}, t} = \boldsymbol{\Omega}_{\textrm{true}, t} + \mathbf{b}_{\boldsymbol{\Omega}, t} + \mathbf{n}_{\boldsymbol{\Omega}},$

$\mathbf{b}_{\mathbf{a},t} = \mathbf{b}_{\mathbf{a},t-1} + \mathbf{z}_{\mathbf{b}_{\mathbf{a}}},$

$\mathbf{b}_{\boldsymbol{\Omega},t} = \mathbf{b}_{\boldsymbol{\Omega},t-1} + \mathbf{z}_{\mathbf{b}_{\boldsymbol{\Omega}}},$

where

  • $\mathbf{a}_{\textrm{true}, t} = \mathbf{R}_t^T ((\mathbf{F}_{\textrm{true}, t} / m) + g\mathbf{e}_3)$
  • $\mathbf{F}_{\textrm{true}, t}$ is the true net force expressed in $\mathcal{W}$
  • $\boldsymbol{\Omega}_{\textrm{true}, t}$ is the angular rate experienced by the simulated robot at time $t$, expressed in $\mathcal{B}$
  • $\mathbf{b}_{\mathbf{a},t}$ and $\mathbf{b}_{\boldsymbol{\Omega},t}$ are the biases
  • $\mathbf{n_a}$ and $\mathbf{n}_{\boldsymbol{\Omega}}$ are the accelerometer and gyroscope noises, drawn from a Gaussian distribution $\mathcal{N}$ with standard deviations $\boldsymbol{\sigma}_\mathbf{a}$ and $\boldsymbol{\sigma}_{\boldsymbol{\Omega}}$
  • $\mathbf{z}_{\mathbf{b}_{\mathbf{a}}}$ and $\mathbf{z}_{\mathbf{b}_{\boldsymbol{\Omega}}}$ are the bias-drift terms of the discrete-time random-walk model, with standard deviations $\boldsymbol{\sigma}_{\mathbf{b_a}}$ and $\boldsymbol{\sigma}_{\mathbf{b}_{\boldsymbol{\Omega}}}$ for the accelerometer and gyroscope
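
A compact sketch of this IMU model follows, assuming the true values, biases, and standard deviations are provided as tensors; the function name and signature are illustrative.

```python
import torch

def imu_step(a_true, omega_true, b_a, b_omega,
             sigma_a, sigma_omega, sigma_b_a, sigma_b_omega):
    """One IMU measurement: additive Gaussian noise plus random-walk biases."""
    # Bias random walk: b_t = b_{t-1} + z, with z ~ N(0, sigma_b^2)
    b_a = b_a + sigma_b_a * torch.randn_like(b_a)
    b_omega = b_omega + sigma_b_omega * torch.randn_like(b_omega)
    # Measurement: true value + bias + white noise n ~ N(0, sigma^2)
    a_meas = a_true + b_a + sigma_a * torch.randn_like(a_true)
    omega_meas = omega_true + b_omega + sigma_omega * torch.randn_like(omega_true)
    return a_meas, omega_meas, b_a, b_omega  # carry the updated biases forward
```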

The Aerial Gym Simulator integrates learning frameworks such as RL Games, Sample Factory and Clean RL. Interfaces are provided to train policies for multirotor control and exteroceptive sensor-based navigation. Environments conforming to the Gymnasium standard are provided for control and navigation tasks for aerial platforms at varying levels of control abstraction.
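
To illustrate what conforming to the Gymnasium standard entails, here is a minimal toy environment following the same API; the observation and action layouts are hypothetical and much simpler than the simulator's tasks.

```python
import gymnasium as gym
import numpy as np

# Toy Gymnasium-conforming task: drive a point-mass "multirotor" to the
# origin with commanded 3-D velocities (layouts are illustrative only).
class VelocityControlTask(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(3,))
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(3,))
        self._pos = np.zeros(3, dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)  # seeds self.np_random
        self._pos = self.np_random.uniform(-5, 5, size=3).astype(np.float32)
        return self._pos.copy(), {}

    def step(self, action):
        self._pos += 0.1 * np.asarray(action, dtype=np.float32)  # dt = 0.1 s
        reward = -float(np.linalg.norm(self._pos))
        terminated = bool(np.linalg.norm(self._pos) < 0.1)
        return self._pos.copy(), reward, terminated, False, {}

env = VelocityControlTask()
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```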

The paper includes benchmarking studies against open-source simulators, comparing physics simulation speeds and rendering throughput. The Aerial Gym Simulator is benchmarked against gym-pybullet-drones, OmniDrones, and the simulator in Learning to Fly in Seconds (L2F). The comparison is performed by commanding constant RPM setpoints to quadrotors in obstacle-free environments. Rendering performance is measured for simulators in similar environments with cube obstacles.
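
A sketch of the measurement procedure, under the assumption that a simulator exposes a step call taking RPM setpoints, might look like this (`step_fn` is a hypothetical stand-in for any simulator under test):

```python
import time

# Illustrative throughput measurement in the style of the benchmark:
# command constant RPM setpoints and report aggregate env-steps per second.
def measure_steps_per_second(step_fn, num_envs, rpm_setpoints, num_steps=1000):
    step_fn(rpm_setpoints)                 # warm-up step (JIT, caches, etc.)
    t0 = time.perf_counter()
    for _ in range(num_steps):
        step_fn(rpm_setpoints)
    elapsed = time.perf_counter() - t0
    return num_steps * num_envs / elapsed  # aggregate across parallel envs
```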

Experimental evaluations demonstrate the performance of the simulator for training policies for various levels of control abstraction and sensor configurations. The RL (Reinforcement Learning) Games framework is used to train policies for high-level control tasks, while Sample Factory is used to train policies for low-level control and vision-based navigation tasks. The trained policies are deployed on real-world robots without additional fine-tuning. State-based DRL policies are trained to control the position of a quadrotor by issuing 3-D velocity and acceleration setpoints alongside yaw-rate commands. A policy is trained for fast, map-free navigation of cluttered environments, using the latent space from the Deep Collision Encoder (DCE). The paper also presents results from a position-setpoint tracking task in which the policy commands the motors directly.
