Trajectory-based actuator identification via differentiable simulation

Published 11 Apr 2026 in cs.RO | (2604.10351v2)

Abstract: Accurate actuation models are critical for bridging the gap between simulation and real robot behavior, yet obtaining high-fidelity actuator dynamics typically requires dedicated test stands and torque sensing. We present a trajectory-based actuator identification method that uses differentiable simulation to fit system-level actuator models from encoder motion alone. Identification is posed as a trajectory-matching problem: given commanded joint positions and measured joint angles and velocities, we optimize actuator and simulator parameters by backpropagating through the simulator, without torque sensors, current/voltage measurements, or access to embedded motor-control internals. The framework supports multiple model classes, ranging from compact structured parameterizations to neural actuator mappings, within a unified optimization pipeline. On held-out real-robot trajectories for a high-gear-ratio actuator with an embedded PD controller, the proposed torque-sensor-free identification achieves much tighter trajectory alignment than a supervised stand-trained baseline dominated by steady-state data, reducing mean absolute position error from 14.20 mrad to as low as 7.54 mrad (1.88 times). Finally, we demonstrate downstream impact for the same actuator class in a real-robot locomotion study: training policies with the refined actuator model increases travel distance by 46% and reduces rotational deviation by 75% relative to the baseline.

Summary

  • The paper introduces a trajectory-matching identification method that uses gradient-based optimization in differentiable simulation to align simulated and real actuator trajectories.
  • It demonstrates significant reduction in joint position errors and enhances sim-to-real reinforcement learning by accurately fitting both parametric and neural actuator models.
  • Experimental results show that trajectory-based identification nearly halves the position MAE relative to the test-stand baseline (14.20 mrad to as low as 7.54 mrad), and that policies trained with the refined actuator model perform better on the real robot.

Trajectory-Based Actuator Identification via Differentiable Simulation: A Technical Analysis

Introduction and Context

Effective system identification of robotic actuators is fundamental to accurate physics simulation for control and reinforcement learning. Traditional approaches often rely on either simplified actuator abstractions or detailed hardware-based test-stand characterization, but each exhibits deficiencies for sim-to-real transfer. The discussed work introduces a trajectory-matching identification methodology using differentiable simulation. This framework poses identification as a parameter optimization task, leveraging gradient-based techniques to align simulated and real-world trajectories, and operates without requiring direct torque or current sensing. The method is validated on high-gear-ratio actuators with embedded PD control and further evaluated for its impact on downstream RL locomotion performance.

Methodological Framework

The trajectory-based actuator identification is formulated as follows. A parameterized simulator $\Phi_z$ generates predicted state trajectories $s'_i$ for a given sequence of control inputs $a_i$ from initial states $s_0$. The unknown actuator and simulation parameters $z^*$ are estimated by minimizing the cumulative trajectory discrepancy between real and simulated rollouts:

$$\mathcal{L}_\text{batch}(z) = \frac{1}{MN} \sum_{j=0}^{M-1} \sum_{i=1}^{N} \|W(s'_{i,j} - s_{i,j})\|_2^2$$

where $W$ is a diagonal weighting matrix penalizing discrepancies in joint positions and velocities. The optimization employs gradient-based solvers enabled by differentiable simulators (specifically MJX, a JAX-based MuJoCo variant), making the approach compatible with both compact parametric actuator models and high-capacity neural mappings.
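As a minimal sketch of the objective above, the batch loss can be written in a few lines of JAX. The array layout (M rollouts, N steps, D state dimensions), the function name `batch_trajectory_loss`, and the toy numbers are assumptions made for illustration and do not come from the paper.

```python
import jax.numpy as jnp

def batch_trajectory_loss(sim_states, real_states, weights):
    """Mean over M rollouts and N steps of ||W (s' - s)||_2^2.

    sim_states, real_states: arrays of shape (M, N, D)
    weights: length-D vector holding the diagonal of W
    """
    residual = (sim_states - real_states) * weights  # apply the diagonal W
    per_step = jnp.sum(residual ** 2, axis=-1)       # squared 2-norm per step
    return jnp.mean(per_step)                        # 1/(MN) times the double sum

# Toy data: M=2 rollouts, N=3 steps, D=2 (joint position, joint velocity).
real = jnp.zeros((2, 3, 2))
sim = jnp.ones((2, 3, 2)) * jnp.array([0.1, 1.0])  # 0.1 rad and 1.0 rad/s off
w = jnp.array([1.0, 0.1])                          # assumed position/velocity weights
loss = batch_trajectory_loss(sim, real, w)
```

Because $W$ is diagonal, it is stored as a vector and applied by broadcasting rather than as a full matrix product.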

The sensor data requirements are minimal: only encoder-based joint positions and velocities, along with control commands, are necessary. The framework supports actuator parameterizations ranging from structured PD/armature models to expressive neural network torque maps, and even per-timestep free-torque "oracle" sequences that upper-bound the achievable trajectory fit.

Figure 1: Illustration of trajectory-matching optimization: the real system follows trajectory $\{s_i\}$ under control inputs $\{a_i\}$; the simulator, parameterized by $z$, generates predicted states $s'_i$, and the error at each step is minimized.
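The trajectory-matching loop can be illustrated with a hedged toy sketch. It does not use MJX: a hand-written 1-DoF joint with an embedded PD law stands in for the simulator, the "measured" data are synthetic, and the inertia, time step, loss weights, gain values, and Adam settings are all assumptions chosen for the example. Only the PD gains are fitted, because with a lone joint and no other known dynamics, gains and inertia are identifiable only up to their ratio.

```python
import jax
import jax.numpy as jnp

DT = 0.01       # integration step [s] (assumed)
INERTIA = 0.05  # known joint inertia [kg m^2] (assumed, held fixed)
W = jnp.array([1.0, 0.1])  # assumed diagonal weights for (position, velocity)

def rollout(log_z, commands, s0):
    """Simulate a 1-DoF joint under PD position control; returns (N, 2) states."""
    gains = jnp.exp(log_z)  # log-parameterization keeps the gains positive
    kp, kd = gains[0], gains[1]

    def step(state, a):
        q, qd = state[0], state[1]
        torque = kp * (a - q) - kd * qd
        qd_next = qd + DT * torque / INERTIA  # semi-implicit Euler
        q_next = q + DT * qd_next
        nxt = jnp.array([q_next, qd_next])
        return nxt, nxt

    _, states = jax.lax.scan(step, s0, commands)
    return states

def loss_fn(log_z, commands, s0, real_states):
    sim_states = rollout(log_z, commands, s0)
    return jnp.mean(jnp.sum((W * (sim_states - real_states)) ** 2, axis=-1))

# Synthetic "encoder log" generated from hidden ground-truth gains.
t = jnp.arange(300) * DT
commands = 0.5 * jnp.sin(2.0 * jnp.pi * t)  # commanded joint positions
s0 = jnp.zeros(2)
real_states = rollout(jnp.log(jnp.array([20.0, 1.0])), commands, s0)

# Backpropagate through the rollout; Adam is written out to avoid extra dependencies.
value_and_grad = jax.jit(jax.value_and_grad(loss_fn))
log_z = jnp.log(jnp.array([10.0, 0.5]))  # deliberately wrong initial guess
m, v = jnp.zeros(2), jnp.zeros(2)
lr, b1, b2, eps = 0.05, 0.9, 0.999, 1e-8
initial_loss, _ = value_and_grad(log_z, commands, s0, real_states)
for k in range(1, 301):
    _, g = value_and_grad(log_z, commands, s0, real_states)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    log_z = log_z - lr * (m / (1 - b1 ** k)) / (jnp.sqrt(v / (1 - b2 ** k)) + eps)
final_loss = loss_fn(log_z, commands, s0, real_states)
kp_hat, kd_hat = jnp.exp(log_z)  # fitted gains
```

The same structure would apply if `rollout` wrapped a full differentiable physics engine and `log_z` were replaced by neural network weights; only the parameter count changes.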

Experimental Validation and Baseline Comparisons

Validation of the framework targets closed-loop fidelity for a high-gear-ratio actuator (210:1) with nontrivial embedded control. The identification pipeline is benchmarked against multiple baselines:

  • Bench-Sup: An MLP-based supervised model trained on torque-labeled test-stand data.
  • Param-ES: A gradient-free evolution strategy optimizer (sMAES) for parametric actuator parameters.
  • Residual-RL: A neural residual policy in the spirit of ASAP-style sim-to-real correction.
  • TrajID-Param/TrajID-NN: Differentiable-simulation-based identification of parametric and neural-network actuator models, respectively (proposed).
  • Torque-Oracle: An unconstrained per-timestep torque sequence upper bound.

Quantitative assessment is performed by aligning models with measured hardware rollouts under identical control inputs, and MAE in joint position is reported.

Figure 2: Measured (black dashed) versus simulated (colored) motor velocities under identical control inputs, for each actuator model.

Figure 3: Position MAE for each actuator model, revealing substantial gains for trajectory-matching methods over test-stand baselines.
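For reference, the position MAE used in this comparison can be computed as sketched below. The function name `position_mae_mrad` and the sample values are hypothetical; the only assumption about units is that joint angles are logged in radians and reported in milliradians.

```python
import jax.numpy as jnp

def position_mae_mrad(sim_q, real_q):
    """Mean absolute joint-position error, converted from rad to mrad."""
    return 1000.0 * jnp.mean(jnp.abs(sim_q - real_q))

# Hypothetical joint angles in radians, not data from the paper.
real_q = jnp.array([0.00, 0.10, 0.20])
sim_q = jnp.array([0.01, 0.09, 0.22])
mae = position_mae_mrad(sim_q, real_q)  # roughly 13.3 mrad
```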

Notable numerical results:

  • Bench-Sup: 14.20 mrad MAE.
  • TrajID-NN: [value missing in source] mrad MAE.
  • Residual-RL: [value missing in source] mrad MAE.
  • TrajID-Param: [value missing in source] mrad MAE.
  • Torque-Oracle: [value missing in source] mrad MAE.

Trajectory-matching methods nearly halve the position MAE versus the test-stand-trained baseline, from 14.20 mrad to as low as 7.54 mrad. Notably, gradient-based optimization scales efficiently to high-dimensional actuator models, whereas gradient-free methods are only tractable for small parameter sets.
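The size of the improvement follows directly from the two MAE figures quoted in the abstract (14.20 mrad for the stand-trained baseline, 7.54 mrad for the best trajectory-matching result). The arithmetic is worked out here only to make explicit how the factor of 1.88 relates to a percentage reduction:

```latex
\[
\frac{14.20\ \text{mrad}}{7.54\ \text{mrad}} \approx 1.88,
\qquad
\frac{14.20 - 7.54}{14.20} = \frac{6.66}{14.20} \approx 0.469
\quad (\text{about a } 46.9\%\ \text{reduction}).
\]
```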

Optimization Stability, Sensitivity, and Design Choices

The proposed identification shows strong repeatability. In 25 independent runs, estimated PD and armature parameters have negligible variance, with consistently stable convergence.

Figure 4: Parameter, loss, and gradient norm convergence for TrajID-Param over 25 runs, indicating high repeatability.

Objective sensitivity analyses demonstrate that the compact parametric models are robust to loss weighting and prediction horizon choices. In contrast, flexible neural models are more susceptible to overfitting on noisy velocity data and can accumulate integration errors over longer horizons.

Figure 5: Validation MAE versus the position/velocity weighting in the loss for different model architectures.

Figure 6: Validation MAE versus prediction horizon, illustrating that parametric models are robust to the horizon while flexible neural models show more variable, horizon-dependent behavior.
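One plausible way to realize a prediction horizon, sketched here as an assumption rather than as the paper's procedure, is to cut a long log into non-overlapping windows of H steps and restart each simulated rollout from the measured state at the start of its window. The helper name `split_into_windows` and the array shapes are hypothetical.

```python
import jax.numpy as jnp

def split_into_windows(states, commands, horizon):
    """Cut one log into M non-overlapping windows of `horizon` steps.

    states:   (T + 1, D) measured states, states[0] being the initial state
    commands: (T,) commands, commands[i] applied between states[i] and states[i + 1]
    Returns initial states (M, D), commands (M, H), and targets (M, H, D).
    """
    n_windows = (states.shape[0] - 1) // horizon
    usable = n_windows * horizon  # drop the incomplete tail, if any
    s0 = states[:usable:horizon]
    cmd = commands[:usable].reshape(n_windows, horizon)
    targets = states[1:usable + 1].reshape(n_windows, horizon, -1)
    return s0, cmd, targets

# Dummy log with T=10 steps and D=2 state dimensions.
states = jnp.arange(22.0).reshape(11, 2)
commands = jnp.arange(10.0)
s0_batch, cmd_batch, target_batch = split_into_windows(states, commands, horizon=4)
```

Shorter windows limit how far integration error can accumulate before the rollout is reset to a measurement, which is consistent with the horizon sensitivity described above for flexible models.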

Downstream Robotics Application: RL Policy Transfer

To assess practical implications, the refined actuator models are deployed in a sim-to-real RL workflow for quadrupedal locomotion (miniPi robot). Policies trained with default versus identified actuator parameters are transferred to hardware and evaluated for forward travel distance and rotational alignment.

Figure 7: Real-robot trajectory comparisons for baseline and refined policies; refined model leads to increased distance and improved alignment.

Mean outcomes over 10 trials:

Metric          Baseline                   TrajID Model
Rotation (deg)  [value missing in source]  [value missing in source]
Distance (m)    [value missing in source]  [value missing in source]

The policy trained with the trajectory-identified actuator model achieved 46% greater travel distance and 75% lower rotational deviation than the baseline under identical deployment conditions.

Practical and Theoretical Implications

This approach systematically closes the actuation gap in sim-to-real transfer workflows without additional instrumentation. By fitting actuator dynamics at the trajectory level, including transients, the method avoids pitfalls of steady-state-only test-stand calibration and circumvents the need for torque/current sensor access. Its compatibility with both interpretable parametric and high-capacity neural models enables flexible trade-offs between physical insight and representational power.

Limitations include imperfect coverage of unmodeled actuator nonlinearities (e.g., temperature effects, supply sag, hysteresis), dependence on excitation richness in identification data, and challenges for generalization beyond the tested actuator class. Yet, the demonstrated simulation fidelity gains and real-robot RL transfer improvements confirm clear downstream value.

Future Directions

Several avenues remain for broadening applicability and enhancing robustness:

  • Extension to multi-joint, contact-rich platforms and new actuation modalities.
  • Integration of recurrent and temporal architectures to capture hysteresis, delay, and long-term dependencies.
  • Formal robustness and identifiability analyses, with expanded validation across excitation and task regimes.
  • Incorporation of regularization strategies and meta-learning to further improve generalization and reduce overfitting risks.

Conclusion

The trajectory-based differentiable simulation identification method delivers quantifiable improvements over both test-stand and residual RL baselines for actuator modeling, yielding superior sim-to-real transfer performance in practical legged robotics tasks. The presented results highlight the relevance of full trajectory-level calibration for high-fidelity simulation and robust control pipeline deployment. The approach provides a scalable foundation for future system identification research and sim-to-real RL applications in robotic systems with restricted measurement access.
