COSMIK-MPPI: Scaling Constrained Model Predictive Control to Collision Avoidance in Close-Proximity Dynamic Human Environments

Published 11 Apr 2026 in cs.RO | (2604.10358v1)

Abstract: Ensuring safe physical interaction between torque-controlled manipulators and humans is essential for deploying robots in everyday environments. Model Predictive Control (MPC) has emerged as a suitable framework thanks to its capacity to handle hard constraints, provide strong guarantees and zero-shot adaptability through predictive reasoning. However, Gradient-Based MPC (GB-MPC) solvers have demonstrated limited performance for collision avoidance in complex environments. Sampling-based approaches such as Model Predictive Path Integral (MPPI) control offer an alternative via stochastic rollouts, but enforcing safety via additive penalties is inherently fragile, as it provides no formal constraint satisfaction guarantees. We propose a collision avoidance framework called COSMIK-MPPI combining MPPI with the toolbox for human motion estimation RT-COSMIK and the Constraints-as-Terminations transcription, which enforces safety by treating constraint violations as terminal events, without relying on large penalty terms or explicit human motion prediction. The proposed approach is evaluated against state-of-the-art GB-MPC and vanilla MPPI in simulation and on a real manipulator arm. Results show that COSMIK-MPPI achieves a 100% task success rate with a constant computation time (22 ms), largely outperforming GB-MPC. In simulated infeasible scenarios, COSMIK-MPPI consistently generates collision-free trajectories, contrary to vanilla MPPI. These properties enabled safe execution of complex real-world human-robot interaction tasks in shared workspaces using an affordable markerless human motion estimator, demonstrating a robust, compliant, and practical solution for predictive collision avoidance (cf. results showcased at https://exquisite-parfait-ffa925.netlify.app)


Authors (8)

Summary

The paper introduces COSMIK-MPPI, combining sampling-based control with Constraints-as-Terminations to enforce collision-free trajectories in human-centric settings.
It achieves consistent real-time performance (≈22 ms per planning loop) and scales better than gradient-based MPC in dense, non-convex constraint environments.
Validation in both simulations and real-world experiments confirms robust, adaptive safety and reliable human–robot interaction under dynamic uncertainties.

COSMIK-MPPI: Constrained Model Predictive Collision Avoidance with Sampling-Based Terminations in Dynamic Human Environments

Introduction

The paper "COSMIK-MPPI: Scaling Constrained Model Predictive Control to Collision Avoidance in Close-Proximity Dynamic Human Environments" (2604.10358) advances predictive collision avoidance for robotic manipulators operating in shared workspaces with humans. Current Model Predictive Control (MPC) methods face reliability and scalability issues in environments with dense, dynamic, and uncertain constraints, particularly when dealing with human motion. This work addresses both robustness and computational scalability by integrating Model Predictive Path Integral (MPPI) control with a constraint transcription framework based on Constraints-as-Terminations (CaT) and a real-time, markerless human motion estimation system (RT-COSMIK).

Limitations of Standard Approaches

Conventional Gradient-Based MPC (GB-MPC) achieves hard constraint satisfaction and provides guarantees on safety and feasibility, but suffers from several limitations in the context of human–robot interaction:

Non-convex and dense constraint landscapes induce sensitivity to initial solution estimates, frequent local minima entrapment, and poor scalability as the number and complexity of collision constraints grow.
Computational burden increases superlinearly with additional constraints, often breaching real-time requirements in highly constrained or rapidly evolving environments.
Reliance on precise motion prediction for obstacles, especially humans, is a requirement not easily met with noisy, markerless perception systems.

In contrast, sampling-based methods like standard MPPI are more scalable with respect to constraint complexity and allow for direct incorporation of environmental uncertainty. However, enforcing safety through additive penalty costs offers no formal guarantees: aggressive penalization can destabilize the optimization, while weakly enforced penalties can still be exploited by unsafe trajectories.
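To make the fragility of additive penalties concrete, the block below writes out a simplified, textbook-style MPPI update with a collision penalty. The notation is an assumption introduced here for illustration and is not taken from the paper: $\ell$ is a task cost, $d(x)$ a signed distance to the nearest obstacle, $\mu$ a penalty weight, $\lambda$ the temperature, and $\varepsilon_t^{(k)}$ the control noise of rollout $k$.

```latex
\begin{align}
S^{(k)} &= \sum_{t=0}^{T-1} \Big[ \ell\big(x_t^{(k)}, u_t^{(k)}\big)
           + \mu \,\max\!\big(0,\, -d(x_t^{(k)})\big) \Big], \\
w^{(k)} &= \frac{\exp\!\big(-\tfrac{1}{\lambda}\,(S^{(k)} - \min_j S^{(j)})\big)}
                {\sum_{i=1}^{K} \exp\!\big(-\tfrac{1}{\lambda}\,(S^{(i)} - \min_j S^{(j)})\big)}, \\
u_t &\leftarrow u_t + \sum_{k=1}^{K} w^{(k)}\, \varepsilon_t^{(k)} .
\end{align}
```

Under these assumptions, every rollout keeps a strictly positive weight $w^{(k)}$ for any finite $\mu$, so a colliding rollout with a low task cost can still pull the control update toward a collision. Raising $\mu$ shrinks that influence but also makes the weights numerically peaked, which is the trade-off described above.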

COSMIK-MPPI Formulation

The proposed COSMIK-MPPI method couples the following key innovations:

Sampling-Based Prediction: MPPI is leveraged for stochastic trajectory sampling and online optimization, exhibiting computational performance largely decoupled from the number of constraints.
Constraints-as-Terminations: Rather than using additive collision penalties, the controller treats constraint violation as a termination event during rollouts. This formulation biases the sampling distribution toward trajectories with low termination hazards, thereby strongly enforcing safety constraints in an expectation-maximizing fashion.
Adaptive Normalization: The CaT formalism adaptively scales termination probabilities based on violation magnitudes observed in real time, which removes the need for manual tuning and improves robustness under varying task and perception conditions.
Markerless Human Pose Estimation (RT-COSMIK): The perception module operates at 20 Hz with low-cost RGB cameras, utilizing Neural Localizer Fields for 3D continuous pose reconstruction and biomechanically consistent inverse kinematics. This approach tolerates occlusions and offers closed-form, real-time signed distance evaluation between articulated robot/human models via capsule representations.
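One possible reading of the Constraints-as-Terminations and adaptive normalization items above is sketched below. This is an illustrative assumption, not the authors' implementation: the function name `cat_mppi_weights`, the parameters `p_max` and `lam`, the use of non-negative rewards, and the normalization by the largest violation in the current batch are all hypothetical choices made for this example.

```python
import numpy as np


def cat_mppi_weights(rewards, violations, p_max=1.0, lam=1.0, eps=1e-9):
    """Survival-weighted MPPI weights (hypothetical sketch).

    rewards:    (K, T) non-negative per-step rewards for K rollouts.
    violations: (K, T) constraint violation magnitudes, 0 when satisfied.
    Returns the (K,) rollout weights and the (K, T) survival probabilities.
    """
    rewards = np.asarray(rewards, dtype=float)
    violations = np.maximum(np.asarray(violations, dtype=float), 0.0)

    # Adaptive normalization (assumed form): scale by the largest
    # violation observed in this batch instead of a hand-tuned constant.
    c_max = max(violations.max(), eps)

    # Per-step termination probability in [0, p_max].
    delta = p_max * np.clip(violations / c_max, 0.0, 1.0)

    # Probability that a rollout is still "alive" at each step,
    # counting the termination risk of the current step.
    survival = np.cumprod(1.0 - delta, axis=1)

    # Rewards collected after a likely termination are discounted away.
    returns = (survival * rewards).sum(axis=1)

    # Softmax over returns, shifted by the maximum for numerical stability.
    w = np.exp((returns - returns.max()) / lam)
    return w / w.sum(), survival
```

In this sketch a rollout that violates a constraint early loses most of its remaining return, so its weight drops without any large penalty constant. Calling `cat_mppi_weights(rewards, violations)` on a batch of rollouts returns weights that can be used in place of the usual exponentiated-cost weights in the MPPI control update.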

Simulation Benchmarking

To rigorously compare COSMIK-MPPI with state-of-the-art GB-MPC and vanilla MPPI, the authors introduce six simulated collision avoidance scenarios of escalating complexity.

Figure 2: Six simulation scenarios evaluating collision avoidance under increasing geometric and constraint complexity; red and green dots indicate alternate end-effector targets.

Key results include:

Across tractable scenarios (1–3), all methods achieve equivalent 100% success rates. GB-MPC, however, exhibits escalating computation times with added constraints, while MPPI-based methods remain consistently real time (21–22 ms per planning loop).
In non-convex and highly constrained environments (scenarios 4–5), GB-MPC consistently fails—unable to deliver feasible trajectories in real time—whereas both MPPI variants succeed, with COSMIK-MPPI strictly favoring collision avoidance.
In infeasible scenarios (6), COSMIK-MPPI and GB-MPC both avoid collisions but do not reach the goal, correctly producing conservative behavior. Vanilla MPPI, by contrast, is more likely to violate safety.

Importantly, the MPPI-based schemes introduce higher-frequency torque command variations due to their stochastic nature. While GB-MPC produces smoother solutions when feasible, its lack of scalability limits practical deployment for collaborative manipulation.

Real-World Experimental Validation

The system is validated in a series of pick-and-place experiments encompassing static and dynamic environments, physical perturbation, and close human–robot proximity with high constraint complexity.

Figure 4: Five experimental tasks evaluating COSMIK-MPPI: baseline, physical perturbation, human-aware avoidance, concurrent activity, and highly constrained human–robot interaction.

Notable behavioral observations and qualitative metrics are:

In static and lightly dynamic settings, the robot executes smooth, rapid, collision-free motions.
Under physical perturbation, compliant low-level control enables safe, gradual recovery and continuous online trajectory replanning.
With dynamic human motion and highly constrained geometries (e.g., the human forms a closed arm loop), COSMIK-MPPI naturally trades off traversal speed for safety, consistently maintaining small but non-zero separation distances to humans and objects.
The robot demonstrates reactivity and adaptation, with successful avoidance and completion of tasks in environments with unpredictable, occluded, or rapidly changing human movement.

Figure 1: Analysis of minimum robot–environment/human distances and end-effector velocity in a highly constrained setting; velocity reduction aligns with proximity, evidencing safety prioritization.
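Minimum distances of this kind can be evaluated in closed form when robot links and human limbs are modeled as capsules, as the summary notes for RT-COSMIK. The sketch below is a generic illustration using the standard closest-point computation between two line segments; it is not the paper's code, and the names `segment_distance` and `capsule_signed_distance` as well as the `(start, end, radius)` capsule layout are assumptions made here.

```python
import numpy as np


def segment_distance(p1, q1, p2, q2, eps=1e-12):
    """Shortest distance between segments p1-q1 and p2-q2 (3D points)."""
    p1, q1, p2, q2 = (np.asarray(v, dtype=float) for v in (p1, q1, p2, q2))
    d1, d2, r = q1 - p1, q2 - p2, p1 - p2
    a, e, f = d1 @ d1, d2 @ d2, d2 @ r

    if a <= eps and e <= eps:          # both segments are points
        s = t = 0.0
    elif a <= eps:                     # first segment is a point
        s, t = 0.0, np.clip(f / e, 0.0, 1.0)
    else:
        c = d1 @ r
        if e <= eps:                   # second segment is a point
            t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
        else:
            b = d1 @ d2
            denom = a * e - b * b      # zero when segments are parallel
            s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > eps * a * e else 0.0
            t = (b * s + f) / e
            if t < 0.0:                # clamp t, then recompute s
                t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
            elif t > 1.0:
                t, s = 1.0, np.clip((b - c) / a, 0.0, 1.0)

    return float(np.linalg.norm((p1 + s * d1) - (p2 + t * d2)))


def capsule_signed_distance(cap_a, cap_b):
    """Signed distance between capsules given as (start, end, radius).

    Positive when the capsules are separated, negative when they overlap.
    """
    (pa, qa, ra), (pb, qb, rb) = cap_a, cap_b
    return segment_distance(pa, qa, pb, qb) - (ra + rb)
```

Taking the minimum of `capsule_signed_distance` over all robot-link and human-limb capsule pairs gives a single clearance value per time step, which is the kind of quantity a constraint check or a plot like Figure 1 would use.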

Implications and Future Directions

The COSMIK-MPPI approach demonstrates a highly practical marriage of sampling-based control, formal constraint transcriptions, and robust, markerless human sensing for collaborative robotics. The main empirical claims are:

100% task success and collision avoidance in feasible scenarios under constant computation times (22 ms), regardless of geometric or constraint complexity.
Failure of GB-MPC in scaling to multiple, non-convex constraints, despite its smoothness and optimality advantages in simpler tasks.
The safety bias is stronger and more reliable with the CaT formulation than with standard penalty approaches.

Practically, these results suggest that sampling-based MPC augmented with CaT is preferable in settings requiring real-time safety and accommodation of uncertain or dynamic obstacles—especially in human-centric environments where constraint landscapes are high-dimensional and continually shifting.

From a theoretical perspective, the explicit coupling of survival-weighted cost shaping and rollout termination within the sampling paradigm could inform further developments in safe reinforcement learning and adaptive motion planning. Extensions might readily integrate anticipatory models for human motion prediction as they improve in reliability, further advancing collaborative autonomy.

Conclusion

COSMIK-MPPI operationalizes robust predictive collision avoidance in dynamic, shared human–robot workspaces by integrating constraint-aware sampling, adaptive termination, and markerless human perception. The approach provides strong safety guarantees, computational efficiency invariant to constraint complexity, and reactivity tailored to rich, real-world manipulation scenarios. These advances mark a significant step for scalable, safe robot deployment in practical collaborative environments, and open avenues toward more complex, anticipatory, and interactive robot behaviors (2604.10358).
