
Contact Trust Region (CTR) Framework

Updated 12 June 2026
  • The Contact Trust Region (CTR) framework is a suite of algorithmic and geometric methods that explicitly incorporate unilateral and non-smooth contact constraints for advanced optimization and decision-making.
  • It combines local Taylor- or mirror-map approximations with specialized trust regions to maintain feasibility, supporting applications in dexterous robotic manipulation, nonlinear elasticity, and constrained reinforcement learning.
  • Empirical results show CTR’s ability to achieve low tracking errors and efficient convergence in both online MPC and offline roadmap-based planning, outperforming standard methods in contact-rich scenarios.

The Contact Trust Region (CTR) framework encompasses a suite of algorithmic and geometric tools for solving contact-rich decision and optimization problems. CTR methodologies generalize classical trust region techniques by explicitly incorporating the unilateral and often non-smooth constraints that arise in physical and decision-theoretic contact. Applications span dexterous robotic manipulation, nonlinear elasticity with contact, and safety-constrained reinforcement learning. Distinct variants include the contact-aware trust region for convex trajectory optimization in manipulation (Suh et al., 4 May 2025), filter–trust-region methods for large deformation mechanics (Youett et al., 2017), and constraint-barrier trust region policy updates in reinforcement learning (Milosevic et al., 2024). CTR techniques combine local Taylor- or mirror-map-based modeling with contact-specific feasibility domains, resulting in scalable, globally convergent solutions.

1. Contact Trust Region in Dexterous Manipulation

The Contact Trust Region as developed in "Dexterous Contact-Rich Manipulation via the Contact Trust Region" (Suh et al., 4 May 2025) provides a principled local approximation for contact dynamics in manipulation tasks. The underlying model is quasidynamic: the configuration $q \in \mathbb{R}^{n_q}$, which stacks actuated ("robot") and unactuated ("object") coordinates, has its evolution governed by a second-order cone program (SOCP). At each decision step, the system evolves according to:

  • $q_+ = f(q, u)$, where $u$ is the commanded robot position.
  • Evolution is subject to velocity cone constraints $J_i(q)\, q_+ + c_i(q) \in \mathcal{K}_i$, which enforce nonpenetration and friction.

Rather than employing a standard ellipsoidal trust region, CTR defines the region of trust as the set of $(\delta q, \delta u)$ for which first-order Taylor approximations of the next state $q_+$ and contact force $\lambda$ are guaranteed to satisfy both primal (no penetration) and dual (friction cone) constraints:

  • $\mathcal{S}_{\Sigma,\kappa}(\bar{q}, \bar{u}) = \{ (\delta q, \delta u) \in \mathcal{E}_\Sigma \mid \text{linearized } q_+ \text{ and } \lambda_+ \text{ feasible for all contacts} \}$.

Two variants are distinguished. Strict CTR enforces both primal and dual feasibility. Relaxed CTR (R-CTR) relaxes the nonpenetration (primal) constraint for numerical performance but maintains dual (force) feasibility in the linearized contact cone.
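
The set definition above can be read as a membership test. The following is a minimal Python sketch of that test, not the authors' implementation. It assumes an ellipsoid of the form δᵀΣ⁻¹δ ≤ 1, cone coordinates ordered as (normal, tangential), a motion cone taken to be the dual of the friction cone (coefficient 1/μ), and a relaxed variant that simply skips the primal check. The names `in_ctr`, `A`, `D_lam`, and the dictionary keys are hypothetical, and the role of κ is not modeled.

```python
import math

def matvec(M, v):
    """Dense matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(a, b):
    """Elementwise sum of two equal-length lists."""
    return [x + y for x, y in zip(a, b)]

def in_ellipsoid(delta, sigma_inv):
    """True if delta^T Sigma^{-1} delta <= 1."""
    return sum(d * w for d, w in zip(delta, matvec(sigma_inv, delta))) <= 1.0

def in_cone(v, coeff, tol=1e-9):
    """Second-order cone test ||v[1:]|| <= coeff * v[0], with v = (normal, tangential...)."""
    return math.hypot(*v[1:]) <= coeff * v[0] + tol

def in_ctr(delta, sigma_inv, q_plus_bar, A, contacts, relaxed=False):
    """Check delta = (dq, du) against the ellipsoid and the linearized contact constraints.

    q_plus_bar + A @ delta is the first-order model of the next state (A is hypothetical).
    Each contact supplies lam_bar + D_lam @ delta as the first-order force model.
    """
    if not in_ellipsoid(delta, sigma_inv):
        return False
    q_plus_lin = add(q_plus_bar, matvec(A, delta))
    for c in contacts:
        lam_lin = add(c["lam_bar"], matvec(c["D_lam"], delta))
        # Dual feasibility: linearized force stays in the friction cone.
        if not in_cone(lam_lin, c["mu"]):
            return False
        if not relaxed:
            # Primal feasibility: J q_+ + c stays in the motion cone (assumed dual cone).
            primal = add(matvec(c["J"], q_plus_lin), c["c"])
            if not in_cone(primal, 1.0 / c["mu"]):
                return False
    return True
```

In this sketch every point accepted by the strict test is also accepted by the relaxed test, which mirrors the strict/relaxed distinction described above.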

2. CTR-Based Local Model Predictive Control (MPC) Formulation

CTR is integrated into a local, finite-horizon MPC scheme by solving a convex trajectory optimization problem with dynamics linearized around a nominal trajectory. The optimization:

  • Minimizes a cost that penalizes terminal deviation and non-smooth controls.
  • Subjects the first-order dynamics to the linearized contact model and R-CTR constraints at every time step.
  • Yields a subproblem that is a sequence of SOCP stages enforcing dual-cone feasibility (or both primal and dual feasibility in the strict variant).

Typically, 1–3 sequential convexification steps suffice to converge, thanks to the stability imparted by the contact-aware trust region.
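
To make the linearize-solve-update pattern concrete, here is a minimal Python sketch on a scalar toy problem. Everything in it is an assumption for illustration: the softplus "pusher" dynamics, the one-step horizon, the quadratic cost, and the plain interval trust region of radius `rho`, which stands in for the contact-aware region and is not R-CTR. The iteration count of this toy says nothing about the 1–3 steps reported for the actual method.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x)), a smooth stand-in for max(0, x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    """Derivative of softplus."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def f(q, u):
    """Hypothetical smoothed contact model: a pusher commanded to u moves an object at q."""
    return q + softplus(u - q)

def convex_step(q0, u_bar, q_goal, r, rho):
    """Solve min over |du| <= rho of (e + B*du)^2 + r*du^2 for the linearized model."""
    e = f(q0, u_bar) - q_goal          # terminal deviation at the nominal control
    B = sigmoid(u_bar - q0)            # d f / d u at the nominal control
    du = -B * e / (B * B + r)          # unconstrained minimizer of the 1-D quadratic
    return max(-rho, min(rho, du))     # clipping is exact for a 1-D convex quadratic

def sequential_convexification(q0, u_init, q_goal, r=1e-3, rho=0.5, iters=20):
    """Repeat: linearize at the nominal control, solve the convex step, update."""
    u = u_init
    for _ in range(iters):
        u += convex_step(q0, u, q_goal, r, rho)
    return u
```

The trust-region radius limits how far each update moves from the point where the linearization was taken, which is the role the contact-aware region plays in the full method.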

3. Algorithmic Realization of CTR-MPC

CTR-MPC algorithms involve two principal routines:

  • Offline trajectory optimization (CtrTrajOpt): Iteratively roll out the nonlinear contact model, solve the convex trust-region subproblem, and update the nominal trajectory.
  • Online MPC loop: At each step, warm-start the trajectory and solve for an optimized local control sequence, applying only the first control, then advancing the system and repeating.

An initial-guess heuristic, in which a virtual torque is applied to establish contact, accelerates convergence when the system starts out of contact.
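
The online loop described above follows the usual receding-horizon pattern, sketched below in Python under strong simplifications. The one-dimensional unilateral push model `simulate`, the greedy `plan` function that stands in for the convex trust-region solve, and all parameter names are hypothetical. Warm-starting and the initial-guess heuristic are omitted.

```python
def simulate(q, u):
    """Hypothetical unilateral contact: the pusher at u can push the object, never pull it."""
    return max(q, u)

def plan(q, q_goal, horizon, max_step):
    """Stand-in for the trajectory optimizer: step toward the goal, at most max_step per stage."""
    controls = []
    for _ in range(horizon):
        q = q + max(-max_step, min(max_step, q_goal - q))
        controls.append(q)
    return controls

def mpc_loop(q0, q_goal, steps=10, horizon=3, max_step=0.2):
    """Receding-horizon loop: replan, apply only the first control, advance, repeat."""
    q = q0
    history = [q]
    for _ in range(steps):
        controls = plan(q, q_goal, horizon, max_step)
        q = simulate(q, controls[0])   # only the first control is executed
        history.append(q)
    return history
```

Calling `mpc_loop(0.0, 1.0)` drives the object to the goal in steps of at most 0.2, while `mpc_loop(1.0, 0.0)` leaves it in place because the toy pusher cannot pull.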

4. Global Planning via Roadmaps

While CTR-MPC provides high-fidelity local planning, it is complemented by a roadmap-based global strategy for contact-rich systems with complex mode transitions:

  • Construct an offline graph whose nodes correspond to stable object-robot contact configurations.
  • Edges encode feasible local transitions, determined by successful runs of CTR-MPC and collision-free arm repositioning.
  • Online, connect the start and goal to nearest roadmap nodes using MPC, then execute the path yielded by shortest-path search.

Roadmap construction for a high-DOF dexterous hand (e.g., AllegroHand, covering all 24 symmetries across 5 grasps and ≈100 edges) can be accomplished in under 10 minutes using CPU resources.
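
The online query step reduces to a shortest-path search over the precomputed graph. Below is a minimal Python sketch using Dijkstra's algorithm. The adjacency-dictionary representation, the node labels, and the edge costs are assumptions made for illustration; the source does not specify which search algorithm or cost is used.

```python
import heapq

def shortest_path(edges, start, goal):
    """Dijkstra over a directed roadmap given as {node: [(neighbor, cost), ...]}.

    Returns the list of nodes from start to goal, or None if goal is unreachable.
    """
    dist = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, cost in edges.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                parent[nbr] = node
                heapq.heappush(queue, (nd, nbr))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]

# Hypothetical roadmap: nodes are stable grasps, edges are transitions
# that a local planner was able to execute offline.
roadmap = {
    "grasp_A": [("grasp_B", 1.0), ("grasp_C", 3.0)],
    "grasp_B": [("grasp_C", 1.0)],
    "grasp_C": [("grasp_D", 1.0)],
}
```

Each edge on the returned path would then be executed by the local MPC, so an edge only belongs in the graph if the local planner succeeded on it offline.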

5. Empirical Performance in Manipulation

CTR-based planners have been evaluated both in simulation and hardware for planar (IiwaBimanual) and 3D (AllegroHand) systems:

  • Achieved local tracking errors of […] mm / 2.1 mrad (R-CTR) and […] mm / 8.9 mrad (CTR) for IiwaBimanual, and […] mm / […] mrad for AllegroHand.
  • R-CTR outperforms both standard ellipsoidal trust region baselines and strict CTR in mean/variance.
  • Per-iteration MPC wall-clock time: […] ms (R-CTR, IiwaBimanual) to […] ms (CTR, AllegroHand).
  • Roadmap-based global plans are constructed offline in under 10 minutes, with online execution at several Hz using only CPU.
  • Compared to RL-based approaches requiring thousands of GPU-hours for training, CTR offers comparable hardware dexterity with orders-of-magnitude lower compute (Suh et al., 4 May 2025).

6. Filter–Trust-Region Methods in Hyperelastic Contact

In numerical mechanics, the filter–trust-region framework (Youett et al., 2017) solves large deformation contact problems by augmenting the quadratic model with a filter that balances energy decrease against infeasibility reduction. Contact is imposed via mortar discretization, which ensures numerical stability and avoids spurious oscillations. The trust-region subproblem is a bound-constrained QP, solved rapidly using Truncated Nonsmooth Newton Multigrid (TNNMG) with Monotone Multigrid correction for nonconvexities. Global convergence is guaranteed by filter-acceptance criteria and feasibility-restoration phases.
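
The idea of a filter can be illustrated with the generic acceptance rule from the filter-method literature: a trial point is kept only if, against every stored pair, it sufficiently improves either the energy or the infeasibility. The Python sketch below implements that generic rule. The margin `gamma` and the function names are assumptions, and the exact criteria used by Youett et al. may differ.

```python
def acceptable(candidate, filter_entries, gamma=1e-4):
    """Generic filter test for a trial point given as (energy, infeasibility).

    The point is acceptable if, for every stored entry (f_j, theta_j), it has
    sufficiently lower energy or sufficiently lower infeasibility.
    """
    f, theta = candidate
    return all(
        f <= f_j - gamma * theta_j or theta <= (1.0 - gamma) * theta_j
        for f_j, theta_j in filter_entries
    )

def add_to_filter(candidate, filter_entries):
    """Insert a point and drop every stored entry that it dominates."""
    f, theta = candidate
    kept = [(f_j, t_j) for f_j, t_j in filter_entries
            if not (f <= f_j and theta <= t_j)]
    kept.append(candidate)
    return kept
```

A step can therefore be accepted even when the energy increases, provided it moves closer to feasibility, which is what lets the method trade the two objectives without a penalty parameter.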

7. Constrained Trust Region Policy Optimization in Reinforcement Learning

In constrained reinforcement learning, the C-TRPO algorithm (Milosevic et al., 2024) introduces a trust region whose shape is governed by a composite mirror map including barrier terms for safety constraints:

  • The trust region consists only of policies strictly within the safety set, as the Bregman divergence blows up at constraint boundaries.
  • At each policy update, a quadratic subproblem is solved with the reward gradient subject to the barrier-augmented trust region.
  • The result is monotonic improvement in reward and drastic reduction in cumulative constraint violation compared to prior safe RL algorithms, while preserving computational efficiency.

C-TRPO recovers standard TRPO and Constrained Policy Optimization as special cases, offering safety invariance and global convergence to optimal safe policies.
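
The effect of a barrier term on the trust region can be shown with a small numerical sketch. The Python code below adds the Bregman divergence of a logarithmic barrier, evaluated on the constraint slack (budget minus expected cost), to a given KL value. The choice of φ(t) = −log t, the weight `beta`, and all function names are assumptions for illustration and are not claimed to match the barrier used in the C-TRPO paper.

```python
import math

def log_barrier_bregman(x, y):
    """Bregman divergence of phi(t) = -log(t) between slacks x and y (y > 0).

    Equals x/y - 1 - log(x/y); it is zero at x == y and grows without bound as x -> 0+.
    """
    if x <= 0.0:
        return math.inf
    ratio = x / y
    return ratio - 1.0 - math.log(ratio)

def constrained_divergence(kl, cost_new, cost_old, budget, beta=1.0):
    """KL term plus a barrier term on the safety slack (budget - expected cost)."""
    slack_old = budget - cost_old
    if slack_old <= 0.0:
        raise ValueError("the current policy must be strictly safe")
    return kl + beta * log_barrier_bregman(budget - cost_new, slack_old)

def in_trust_region(kl, cost_new, cost_old, budget, radius, beta=1.0):
    """True if the candidate policy lies inside the barrier-augmented trust region."""
    return constrained_divergence(kl, cost_new, cost_old, budget, beta) <= radius
```

With this construction a candidate whose expected cost reaches the budget has infinite divergence, so it falls outside the trust region for any finite radius.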

