From Demonstrations to Safe Deployment: Path-Consistent Safety Filtering for Diffusion Policies (2511.06385v1)

Published 9 Nov 2025 in cs.RO and eess.SY

Abstract: Diffusion policies (DPs) achieve state-of-the-art performance on complex manipulation tasks by learning from large-scale demonstration datasets, often spanning multiple embodiments and environments. However, they cannot guarantee safe behavior, so external safety mechanisms are needed. These, however, alter actions in ways unseen during training, causing unpredictable behavior and performance degradation. To address these problems, we propose path-consistent safety filtering (PACS) for DPs. Our approach performs path-consistent braking on a trajectory computed from the sequence of generated actions. In this way, we keep execution consistent with the policy's training distribution, maintaining the learned, task-completing behavior. To enable a real-time deployment and handle uncertainties, we verify safety using set-based reachability analysis. Our experimental evaluation in simulation and on three challenging real-world human-robot interaction tasks shows that PACS (a) provides formal safety guarantees in dynamic environments, (b) preserves task success rates, and (c) outperforms reactive safety approaches, such as control barrier functions, by up to 68% in terms of task success. Videos are available at our project website: https://tum-lsy.github.io/pacs/.

Summary

The paper introduces PACS, a framework that applies path-consistent, reachability-based safety filtering to diffusion policies for robotic manipulation.
It transforms high-level action chunks into kinematically feasible trajectories to ensure real-time constraint enforcement without deviating from learned distributions.
Empirical results show PACS achieves high task success and outperforms traditional control barrier functions in dynamic human-robot interaction tasks.

Path-Consistent Safety Filtering for Diffusion Policies: A Framework for Safe and Robust Deployment

Introduction

The integration of Diffusion Policies (DPs) into robotic manipulation tasks has brought measurable improvements in task performance, particularly when leveraging large-scale demonstration datasets. Despite these advances, DPs remain black-box models that cannot provide formal guarantees for safe behavior, making their deployment in human-centric and safety-critical environments risky. Reactive safety mechanisms, most notably control barrier functions (CBFs) and other post-hoc safety filters, are commonly used to mitigate this risk. However, such interventions alter actions in ways unseen during training and often drive the robot into out-of-distribution (OOD) states, causing unpredictable behavior and degrading task performance.

Figure 1: Deploying DPs in dynamic environments with moving objects requires safeguarding mechanisms, as the intended policy actions may be unsafe. Reactive strategies, such as control barrier functions, often drive the agent into OOD states not seen during training, leading to unpredictable behavior. PACS is proposed to keep safety interventions path-consistent, avoiding OOD states and enhancing task success rates.

To address these limitations, this work introduces the Path-Consistent Safety Filter (PACS), a reachability-based safety mechanism tailored for DPs that ensures safe operation by enforcing interventions directly along the robot’s intended trajectory, thus maintaining distributional consistency and high task success rates.

System Design and Methodology

Overview of PACS

PACS operates by transforming high-level action chunks, generated by DPs or vision-language-action models, into sequences of waypoints and constructing an intended, kinematically and dynamically feasible trajectory. The system then continuously applies reachability-based safety filtering at high frequency, ensuring that constraints—such as collision avoidance or bounded kinetic energy—are enforced in real time without deviating from the learned action manifold.

Figure 2: System overview of PACS. The policy generates action chunks which are converted into waypoints and intended trajectory. PACS applies high-frequency, reachability-based safety filtering to uphold constraints (e.g., collision avoidance, impact force limits).

Key to PACS’s generality is its modular structure: the policy remains agnostic to the presence of the safety filter, while the safety filter operates independently of the policy’s internal mechanisms. Trajectory monitoring is performed using set-based reachability analysis, which robustly encapsulates uncertainties in environment perception and robot tracking accuracy.
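To make the idea of set-based verification concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the human and a robot link are each over-approximated by a sphere, that the human can move at most `v_max` in any direction, and that perception and tracking errors simply inflate the radii. All function names, parameters, and numbers are hypothetical.

```python
import math

def human_reach_radius(r0, v_max, t, meas_err=0.0):
    """Radius of a ball containing every position the human could occupy at time t.

    Assumption: the human starts inside a ball of radius r0 (plus measurement
    error) and moves with speed at most v_max.
    """
    return r0 + meas_err + v_max * t

def robot_reach_radius(r_link, tracking_err=0.0):
    """Radius of a ball containing the robot link, inflated by tracking error."""
    return r_link + tracking_err

def is_verified_safe(robot_pts, times, human_center, r0, v_max,
                     r_link, tracking_err=0.0, meas_err=0.0):
    """Return True only if the robot and human sets are disjoint at every time step."""
    for p, t in zip(robot_pts, times):
        dist = math.dist(p, human_center)
        needed = (human_reach_radius(r0, v_max, t, meas_err)
                  + robot_reach_radius(r_link, tracking_err))
        if dist <= needed:
            return False  # sets may intersect, so safety cannot be verified
    return True

# Hypothetical numbers: robot link stays 1 m from the human's current position.
pts = [(1.0, 0.0, 0.0)] * 3
print(is_verified_safe(pts, [0.1, 0.3, 0.5], (0.0, 0.0, 0.0),
                       r0=0.1, v_max=1.6, r_link=0.05))
```

Because the human's reachable set grows with the look-ahead time, the same robot position can be verified for a short horizon and rejected for a longer one. This is the reason a conservative motion bound leads to more cautious behavior.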

Path-Consistency and Trajectory Generation

A defining feature of PACS lies in its path-consistency principle: safety interventions (slowing, stopping, braking) are executed strictly along the trajectory implied by the DP’s action sequence. This contrasts with traditional reactive controllers, like CBFs, which perturb the robot state laterally—potentially bringing the policy into OOD states where learned behavior is unreliable.
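The distinction between braking along the path and deviating from it can be illustrated with a small sketch. It assumes a piecewise-linear joint-space path parameterized by `s` in [0, 1] and a simple constant-deceleration brake on the path speed. Jerk limits are ignored here for brevity, and all names and values are hypothetical rather than taken from the paper.

```python
import numpy as np

def path_point(waypoints, s):
    """Piecewise-linear interpolation of a path at parameter s in [0, 1]."""
    waypoints = np.asarray(waypoints, dtype=float)
    s = float(np.clip(s, 0.0, 1.0))
    x = s * (len(waypoints) - 1)
    i = min(int(x), len(waypoints) - 2)  # index of the active segment
    frac = x - i
    return (1.0 - frac) * waypoints[i] + frac * waypoints[i + 1]

def rollout(waypoints, s_dot_nominal, brake_from_step, s_ddot_max, dt, n_steps):
    """Follow the path at nominal speed, then brake along the same path.

    Only the path speed s_dot is changed by the intervention; the geometric
    path itself is never altered.
    """
    s, s_dot = 0.0, s_dot_nominal
    points = []
    for k in range(n_steps):
        if k >= brake_from_step:
            s_dot = max(0.0, s_dot - s_ddot_max * dt)  # decelerate, never reverse
        s = min(1.0, s + s_dot * dt)
        points.append(path_point(waypoints, s))
    return np.array(points)

# Hypothetical two-joint path along the diagonal q1 == q2.
traj = rollout([[0.0, 0.0], [1.0, 1.0]], s_dot_nominal=0.5,
               brake_from_step=3, s_ddot_max=1.0, dt=0.1, n_steps=12)
print(np.allclose(traj[:, 0], traj[:, 1]))  # every sample stays on the path
```

In this sketch the robot stops partway along the path and could resume from there by raising `s_dot` again, so the policy is queried only from states that lie on its own intended path.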

Trajectory generation within PACS uses time-optimal, jerk-limited motion planning so that the executed motions respect the robot's kinematic and dynamic constraints. Planning is carried out in joint space, with action chunks interpreted as sequences of desired joint displacements. Because the feasible motion is reconstructed from these chunks and interventions only brake along it, execution stays on the path the policy intended and therefore close to the demonstration data distribution, which supports robust task execution even under frequent safety interventions.
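As a rough illustration of the first step described above, the sketch below accumulates a chunk of joint displacements into absolute joint-space waypoints and computes a lower bound on each segment's duration from velocity limits alone. It is not the time-optimal, jerk-limited planner the paper refers to. The function names and the velocity limits are assumptions made for this example.

```python
import numpy as np

def chunk_to_waypoints(q_current, delta_chunk):
    """Accumulate joint displacements into absolute joint-space waypoints.

    q_current:   current joint positions, shape (n_joints,)
    delta_chunk: desired joint displacements, shape (horizon, n_joints)
    Returns an array of shape (horizon + 1, n_joints) that starts at q_current.
    """
    q_current = np.asarray(q_current, dtype=float)
    deltas = np.asarray(delta_chunk, dtype=float)
    steps = q_current + np.cumsum(deltas, axis=0)
    return np.vstack([q_current, steps])

def min_segment_times(waypoints, v_max):
    """Lower bound on each segment's duration under per-joint velocity limits.

    Acceleration and jerk limits are ignored, so a real planner would
    generally need more time than this bound.
    """
    diffs = np.abs(np.diff(np.asarray(waypoints, dtype=float), axis=0))
    return np.max(diffs / np.asarray(v_max, dtype=float), axis=1)

# Hypothetical two-joint example.
wps = chunk_to_waypoints([0.0, 0.0], [[0.1, 0.0], [0.1, -0.2]])
print(wps)
print(min_segment_times(wps, v_max=[1.0, 0.5]))
```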

Real-World Evaluation and Empirical Results

To validate PACS, extensive experiments were conducted both in simulation (robomimic environments) and with a physical manipulator performing tasks in close human proximity.

Figure 3: Visualization of the three real-world tasks used for evaluation: Sorting (coexistence, collision-free), Handover (collaborative, low-force hand contact), and Feeding (collaborative, low-force mouth contact).

Task Analysis

Three tasks serve as benchmarks:

Sorting: A pure coexistence scenario, requiring strict collision avoidance with a human operator.
Handover: A collaborative manipulation where non-harmful physical contact with a human hand is permitted within energy thresholds.
Feeding: Robot delivers food to a human, requiring extremely fine safety control (with tight kinetic energy bounds).
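For the contact tasks, a kinetic-energy bound can be turned into a speed limit with the relation E = 0.5 * m * v^2. The sketch below assumes a single effective mass for the moving robot part and uses placeholder numbers. The source does not state the actual thresholds or how the paper computes them, so the names and values here are hypothetical.

```python
import math

def max_speed_for_energy(e_max_joule, m_eff_kg):
    """Largest speed v such that 0.5 * m_eff * v**2 <= e_max."""
    if e_max_joule < 0 or m_eff_kg <= 0:
        raise ValueError("energy bound must be >= 0 and effective mass > 0")
    return math.sqrt(2.0 * e_max_joule / m_eff_kg)

def path_speed_scale(current_speed, e_max_joule, m_eff_kg):
    """Factor in [0, 1] by which to slow motion along the path to meet the bound."""
    v_max = max_speed_for_energy(e_max_joule, m_eff_kg)
    if current_speed <= v_max:
        return 1.0  # already within the energy bound, no slowdown needed
    return v_max / current_speed

# Placeholder values: 0.5 J energy bound, 4 kg effective mass.
print(max_speed_for_energy(0.5, 4.0))   # speed limit in m/s
print(path_speed_scale(1.0, 0.5, 4.0))  # slow down to half speed
```

Scaling the speed along the path, rather than changing the path, keeps such an energy constraint compatible with the path-consistency idea: a tighter bound, as in Feeding, simply yields a smaller scale factor near the human.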

Performance Metrics

Empirical results demonstrate:

PACS provides formal safety guarantees in both simulation and hardware, with no observed constraint violations in any physical rollout when active.
Task success rates with PACS stay within 5% of the unsafe, unfiltered policy on most tasks, in contrast to the trade-off typically observed with conventional safety filters.
PACS outperforms control barrier functions in task success on the Sorting task by 37% on hardware and 68% in simulation, and it avoids the OOD failures seen with CBF filtering.
Execution speed with PACS is not compromised unless safety violations would otherwise occur. In some cases, path-consistent trajectory generation even produces faster execution than the unfiltered policy due to optimized motion feasibility checks.

Figure 4: In the Sorting task, PACS keeps motion on-distribution, slowing down when the human is near but not veering from the intended path. In contrast, CBF safety filtering causes multiple OOD departures, resulting in unrecoverable failures and greatly reduced success rates.

Theoretical and Practical Implications

By aligning safety enforcement mechanisms with the data distribution underlying the learned DP, PACS enables the simultaneous realization of formal safety and high task performance in dynamic human-robot interaction (HRI) settings. Notably, the strategy of enforcing path-consistency via trajectory-level, reachability-based analysis departs from many popular approaches that inject safety awareness into the policy at training or during denoising. Such online, model-agnostic post-processing circumvents the need for retraining, fine-tuning, or policy adaptation to dynamic environments.

PACS runs in real time at a 1 kHz control frequency, which supports its suitability for safety-critical deployments (e.g., robotic healthcare, industrial HRI). Furthermore, the path-consistent intervention paradigm can, in principle, be extended to a wide class of chunk-based generative policies, including large vision-language-action models.

Limitations and Prospective Directions

Despite its merits, PACS requires accurate models of both the robot’s dynamics and the potential motion of dynamic obstacles. Conservative assumptions can lead to overly cautious behavior, while model inaccuracies may weaken the guarantees. Additionally, PACS currently assumes that object motion remains within pre-specified bounds.

Future research avenues include:

Adaptive estimation of environment dynamics to tighten reachability bounds in real time.
Integration of explicit policy conditioning on dynamic safety constraints to enhance flexibility in constraint-aware replanning.
Expansion to multi-agent and multi-robot collaborative settings where joint safety verification is required.

Conclusion

PACS offers a rigorous framework for the safe deployment of chunk-based diffusion policies in dynamic, human-centric environments. By making safety filtering path-consistent via trajectory-level reachability analysis, PACS largely avoids the typical trade-off between safety and task performance. The experiments show that PACS maintains high task success rates while upholding safety constraints, and that it outperforms reactive strategies such as control barrier functions. This framework advances the deployment of high-dimensional, generative policies for manipulation in operational settings demanding both adaptability and strict safety guarantees.