- The paper introduces COSMIK-MPPI, combining sampling-based control with Constraints-as-Terminations to enforce collision-free trajectories in human-centric settings.
- It achieves consistent real-time performance (≈22 ms per planning loop) and scales effectively in dense, non-convex constraint environments compared to gradient-based MPC.
- Validation in both simulations and real-world experiments confirms robust, adaptive safety and reliable human–robot interaction under dynamic uncertainties.
COSMIK-MPPI: Constrained Model Predictive Collision Avoidance with Sampling-Based Terminations in Dynamic Human Environments
Introduction
The paper "COSMIK-MPPI: Scaling Constrained Model Predictive Control to Collision Avoidance in Close-Proximity Dynamic Human Environments" (2604.10358) advances predictive collision avoidance for robotic manipulators operating in workspaces shared with humans. Existing Model Predictive Control (MPC) methods face reliability and scalability issues in environments with dense, dynamic, and uncertain constraints, particularly when dealing with human motion. This work addresses both robustness and computational scalability by integrating Model Predictive Path Integral (MPPI) control with a constraint transcription framework based on Constraints-as-Terminations (CaT) and a real-time, markerless human motion estimation system (RT-COSMIK).
Limitations of Standard Approaches
Conventional Gradient-Based MPC (GB-MPC) achieves hard constraint satisfaction and provides guarantees on safety and feasibility, but suffers from several limitations in the context of human–robot interaction:
- Non-convex and dense constraint landscapes induce sensitivity to initial solution estimates, frequent local minima entrapment, and poor scalability as the number and complexity of collision constraints grow.
- Computational burden increases superlinearly with additional constraints, often breaching real-time requirements in highly constrained or rapidly evolving environments.
- Reliance on precise motion prediction for obstacles, especially humans, is a requirement not easily met with noisy, markerless perception systems.
In contrast, sampling-based methods like standard MPPI scale better with constraint complexity and allow environmental uncertainty to be incorporated directly. However, enforcing safety through additive penalty costs provides no formal guarantees: aggressive penalization can destabilize the optimization, while weakly enforced penalties can still be exploited by unsafe trajectories.
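To make the penalty trade-off concrete, the following is a minimal sketch of the standard MPPI softmin weighting combined with an additive collision penalty. It is not taken from the paper: the function names, the temperature `lam`, the gain values, and the three toy rollouts are all assumptions chosen for illustration.

```python
import numpy as np

def mppi_weights(costs, lam=1.0):
    """Softmin importance weights: w_k proportional to exp(-(J_k - min J) / lam)."""
    costs = np.asarray(costs, dtype=float)
    shifted = costs - costs.min()              # subtract the minimum for numerical stability
    w = np.exp(-shifted / lam)
    return w / w.sum()

def penalized_costs(task_costs, penetration_depths, penalty_gain):
    """Additive soft penalty: task cost + gain * summed penetration depth per rollout."""
    depth = np.clip(np.asarray(penetration_depths, dtype=float), 0.0, None)
    return np.asarray(task_costs, dtype=float) + penalty_gain * depth.sum(axis=1)

# Three hypothetical rollouts over a 4-step horizon (illustrative numbers only).
task_costs = np.array([1.0, 3.0, 3.5])             # rollout 0 is cheapest on task cost ...
penetration = np.array([[0.0, 0.02, 0.03, 0.0],    # ... but it passes through an obstacle
                        [0.0, 0.0, 0.0, 0.0],
                        [0.0, 0.0, 0.0, 0.0]])

# Same rollouts, two different (assumed) penalty gains.
weak = mppi_weights(penalized_costs(task_costs, penetration, penalty_gain=10.0))
strong = mppi_weights(penalized_costs(task_costs, penetration, penalty_gain=1000.0))

print("weak penalty weights:  ", weak)
print("strong penalty weights:", strong)
```

With the weak gain the colliding rollout still receives the largest weight, and only the much larger gain suppresses it. In this toy setting, safety therefore depends entirely on how the gain is tuned relative to the task cost.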
The proposed COSMIK-MPPI method couples the following key innovations:
- Sampling-Based Prediction: MPPI is leveraged for stochastic trajectory sampling and online optimization, exhibiting computational performance largely decoupled from the number of constraints.
- Constraints-as-Terminations: Rather than using additive collision penalties, the controller treats constraint violation as a termination event during rollouts. This formulation biases the sampling distribution toward trajectories with low termination hazards, thereby strongly enforcing safety constraints in an expectation-maximizing fashion.
- Adaptive Normalization: The CaT formalism adaptively scales termination probabilities based on violation magnitudes observed in real time, which removes the need for manual tuning and improves robustness under varying task and perception conditions.
- Markerless Human Pose Estimation (RT-COSMIK): The perception module operates at 20 Hz with low-cost RGB cameras, utilizing Neural Localizer Fields for 3D continuous pose reconstruction and biomechanically consistent inverse kinematics. This approach tolerates occlusions and offers closed-form, real-time signed distance evaluation between articulated robot/human models via capsule representations.
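The closed-form signed distance between capsules mentioned in the last item can be sketched as follows. This is a generic textbook construction (closest points between two line segments, minus the two radii), not the authors' implementation. The `Capsule` class, the function names, and the example geometry are hypothetical.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Capsule:
    """Hypothetical capsule: a segment from p to q swept by a sphere of given radius."""
    p: np.ndarray
    q: np.ndarray
    radius: float

def segment_distance(p1, q1, p2, q2, eps=1e-12):
    """Minimum distance between segments [p1, q1] and [p2, q2]."""
    d1, d2, r = q1 - p1, q2 - p2, p1 - p2
    a, e, f = d1 @ d1, d2 @ d2, d2 @ r
    if a <= eps and e <= eps:              # both segments degenerate to points
        s = t = 0.0
    elif a <= eps:                         # first segment is a point
        s, t = 0.0, np.clip(f / e, 0.0, 1.0)
    else:
        c = d1 @ r
        if e <= eps:                       # second segment is a point
            t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
        else:
            b = d1 @ d2
            denom = a * e - b * b          # zero when the segments are parallel
            s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:                    # clamp t and recompute s for the clamped value
                t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
            elif t > 1.0:
                t, s = 1.0, np.clip((b - c) / a, 0.0, 1.0)
    return float(np.linalg.norm((p1 + s * d1) - (p2 + t * d2)))

def capsule_signed_distance(c1, c2):
    """Signed distance between two capsules; negative values indicate overlap."""
    return segment_distance(c1.p, c1.q, c2.p, c2.q) - (c1.radius + c2.radius)

# Illustrative geometry in metres: a robot link crossing 10 cm below a human forearm.
robot_link = Capsule(np.array([-0.2, 0.0, 0.0]), np.array([0.2, 0.0, 0.0]), 0.06)
human_forearm = Capsule(np.array([0.0, -0.2, 0.1]), np.array([0.0, 0.2, 0.1]), 0.06)
d_cross = capsule_signed_distance(robot_link, human_forearm)

# A second link running parallel to the first, 30 cm away.
parallel_link = Capsule(np.array([-0.2, 0.3, 0.0]), np.array([0.2, 0.3, 0.0]), 0.04)
d_parallel = capsule_signed_distance(robot_link, parallel_link)

print("crossing capsules:", d_cross)
print("parallel capsules:", d_parallel)
```

Because each pairwise query is a handful of dot products and clamps, evaluating it for every robot-link and human-limb pair in every sampled rollout stays cheap, which is one plausible reason capsule models suit sampling-based controllers.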
Simulation Benchmarking
To rigorously compare COSMIK-MPPI with state-of-the-art GB-MPC and vanilla MPPI, the authors introduce six simulated collision avoidance scenarios of escalating complexity.
Figure 2: Six simulation scenarios evaluating collision avoidance under increasing geometric and constraint complexity; red and green dots indicate alternate end-effector targets.
Key results include:
- Across tractable scenarios (1–3), all methods achieve equivalent 100% success rates. GB-MPC, however, exhibits escalating computation times with added constraints, while MPPI-based methods remain consistently real time (21–22 ms per planning loop).
- In non-convex and highly constrained environments (scenarios 4–5), GB-MPC consistently fails to deliver feasible trajectories in real time, whereas both MPPI variants succeed, with COSMIK-MPPI strictly favoring collision avoidance.
- In the infeasible scenario (6), COSMIK-MPPI and GB-MPC both avoid collisions but do not reach the goal, correctly producing conservative behavior. Vanilla MPPI, by contrast, is more likely to violate safety.
Importantly, the MPPI-based schemes introduce higher-frequency torque command variations due to their stochastic nature. While GB-MPC produces smoother solutions when feasible, its lack of scalability limits practical deployment for collaborative manipulation.
Real-World Experimental Validation
The system is validated in a series of pick-and-place experiments encompassing static and dynamic environments, physical perturbation, and close human–robot proximity with high constraint complexity.
Figure 4: Five experimental tasks evaluating COSMIK-MPPI: baseline, physical perturbation, human-aware avoidance, concurrent activity, and highly constrained human–robot interaction.
Notable behavioral observations and qualitative metrics are:
Implications and Future Directions
The COSMIK-MPPI approach demonstrates a highly practical marriage of sampling-based control, formal constraint transcriptions, and robust, markerless human sensing for collaborative robotics. The main empirical claims are:
- 100% task success and collision avoidance in feasible scenarios at near-constant computation times (≈22 ms per planning loop), regardless of geometric or constraint complexity.
- Failure of GB-MPC in scaling to multiple, non-convex constraints, despite its smoothness and optimality advantages in simpler tasks.
- A stronger and more reliable safety bias with the CaT formulation than with standard penalty approaches.
Practically, these results suggest that sampling-based MPC augmented with CaT is preferable in settings requiring real-time safety and accommodation of uncertain or dynamic obstacles—especially in human-centric environments where constraint landscapes are high-dimensional and continually shifting.
From a theoretical perspective, the explicit coupling of survival-weighted cost shaping and rollout termination within the sampling paradigm could inform further developments in safe reinforcement learning and adaptive motion planning. Extensions might readily integrate anticipatory models for human motion prediction as they improve in reliability, further advancing collaborative autonomy.
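The coupling of survival weighting and rollout termination can be illustrated with a small sketch. It is loosely modeled on the general CaT idea (termination probabilities scaled by a running estimate of the largest observed violation) and should not be read as the paper's exact formulation. The class and function names, the moving-average normalizer, the uniform-weight fallback, and all numbers are assumptions.

```python
import numpy as np

class ViolationNormalizer:
    """Assumed adaptive scale: exponential moving average of the batch-maximum violation."""
    def __init__(self, decay=0.95, floor=1e-6):
        self.decay, self.floor, self.scale = decay, floor, None

    def update(self, violations):
        batch_max = max(float(np.max(violations)), self.floor)
        if self.scale is None:             # first batch initializes the scale
            self.scale = batch_max
        else:
            self.scale = self.decay * self.scale + (1.0 - self.decay) * batch_max
        return self.scale

def survival_probabilities(violations, scale, p_max=1.0):
    """Map per-step violations (K rollouts x T steps, >= 0) to per-rollout survival."""
    p_term = p_max * np.clip(violations / scale, 0.0, 1.0)   # per-step termination probability
    return np.prod(1.0 - p_term, axis=1)                       # chance of reaching the horizon

def survival_weighted_mppi(task_costs, survival, lam=1.0):
    """Softmin weights scaled by survival, so likely-terminated rollouts contribute little."""
    w = survival * np.exp(-(task_costs - task_costs.min()) / lam)
    total = w.sum()
    if total <= 0.0:                       # assumed fallback when every rollout terminates
        return np.full_like(w, 1.0 / w.size)
    return w / total

# Three hypothetical rollouts; rollout 0 has the lowest task cost but violates a constraint.
task_costs = np.array([1.0, 3.0, 3.5])
violations = np.array([[0.0, 0.02, 0.03, 0.0],
                       [0.0, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 0.0]])

normalizer = ViolationNormalizer()
scale = normalizer.update(violations)
survival = survival_probabilities(violations, scale)
weights = survival_weighted_mppi(task_costs, survival)

print("survival:", survival)
print("weights: ", weights)
```

In this toy case the violating rollout is removed from the weighted average without any penalty gain being chosen, since the normalizer sets the scale from the observed violations. Whether the paper applies survival multiplicatively to the weights, to the running cost, or in some other way is not stated in this summary.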
Conclusion
COSMIK-MPPI operationalizes robust predictive collision avoidance in dynamic, shared human–robot workspaces by integrating constraint-aware sampling, adaptive termination, and markerless human perception. The approach provides strong safety enforcement, computation times that are largely independent of constraint complexity, and reactivity suited to real-world manipulation scenarios. These advances mark a significant step toward scalable, safe robot deployment in practical collaborative environments, and open avenues for more complex, anticipatory, and interactive robot behaviors (2604.10358).