Real-Time Out-of-Distribution Failure Prevention via Multi-Modal Reasoning (2505.10547v1)

Published 15 May 2025 in cs.RO and cs.AI

Abstract: Foundation models can provide robust high-level reasoning on appropriate safety interventions in hazardous scenarios beyond a robot's training data, i.e. out-of-distribution (OOD) failures. However, due to the high inference latency of Large Vision and Language Models, current methods rely on manually defined intervention policies to enact fallbacks, thereby lacking the ability to plan generalizable, semantically safe motions. To overcome these challenges we present FORTRESS, a framework that generates and reasons about semantically safe fallback strategies in real time to prevent OOD failures. At a low frequency in nominal operations, FORTRESS uses multi-modal reasoners to identify goals and anticipate failure modes. When a runtime monitor triggers a fallback response, FORTRESS rapidly synthesizes plans to fallback goals while inferring and avoiding semantically unsafe regions in real time. By bridging open-world, multi-modal reasoning with dynamics-aware planning, we eliminate the need for hard-coded fallbacks and human safety interventions. FORTRESS outperforms on-the-fly prompting of slow reasoning models in safety classification accuracy on synthetic benchmarks and real-world ANYmal robot data, and further improves system safety and planning success in simulation and on quadrotor hardware for urban navigation.

Summary

The paper presents FORTRESS, a framework for preventing out-of-distribution (OOD) failures in autonomous robots operating in open-world environments. It combines multi-modal reasoning with dynamics-aware planning to generate semantically safe fallback strategies in real time, so that the robot can respond safely to hazards its training data did not cover.

Framework Overview

FORTRESS uses foundation models to anticipate failure modes, identify goals for fallback strategies, and calibrate semantic safety constraints. This reasoning runs at low frequency or offline during nominal operation, so the latency of querying foundation models is not incurred at critical moments. When the runtime monitor detects an anomaly, FORTRESS synthesizes a fallback plan that satisfies the semantic safety constraints and respects the robot's dynamics.
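The two-rate structure described above can be sketched as a simple loop. This is a minimal illustration, not the authors' implementation: the names FallbackCache, run, demo_reasoner, demo_monitor, and demo_planner are hypothetical, the slow reasoner is a stub standing in for a foundation-model query, and a real system would likely run it asynchronously rather than inline.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FallbackCache:
    """Output of the slow reasoning pass, reused by the fast fallback path."""
    goals: List[str] = field(default_factory=list)
    failure_modes: List[str] = field(default_factory=list)


def run(
    n_steps: int,
    slow_period: int,
    slow_reasoner: Callable[[int], FallbackCache],
    monitor: Callable[[int], bool],
    fast_planner: Callable[[FallbackCache], str],
) -> List[Tuple[int, str]]:
    """Refresh the cache every `slow_period` steps; plan only when the monitor fires."""
    cache = FallbackCache()
    events = []
    for step in range(n_steps):
        if step % slow_period == 0:
            # Slow path: stands in for a high-latency foundation-model query.
            cache = slow_reasoner(step)
        if monitor(step):
            # Fast path: uses only cached results, no model query here.
            events.append((step, fast_planner(cache)))
    return events


# Hypothetical stubs for illustration only.
def demo_reasoner(step: int) -> FallbackCache:
    return FallbackCache(goals=[f"goal@{step}"], failure_modes=["hypothetical hazard"])


def demo_monitor(step: int) -> bool:
    return step == 7  # pretend an anomaly is detected at step 7


def demo_planner(cache: FallbackCache) -> str:
    return f"plan to {cache.goals[0]}"


EVENTS = run(10, 5, demo_reasoner, demo_monitor, demo_planner)
```

In this toy run the anomaly at step 7 is handled with the goal cached at step 5, which is the point of the design: the expensive reasoning has already happened by the time a fallback is needed.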

Methodological Contributions

FORTRESS introduces an approach for the reliable identification of OOD scenarios and the formulation of adaptive strategies to manage these circumstances. Key contributions of the methodology include:

Multi-modal Reasoning: Integrating Vision-Language Models (VLMs) to translate abstract fallback strategies into physical goal coordinates, enabling planning without hard-coded fallback scenarios.
Semantic Safety Constraints Detection: Utilizing text embedding models to anticipate a set of high-level failure modes ahead of time, thus allowing the system to identify and navigate around semantically unsafe regions in real time.
Dynamics-aware Planning: Incorporating reach-avoid analysis to construct semantically safe trajectories that align with the robot’s dynamics, employing techniques such as rapidly-exploring random trees (RRT) and model predictive control (MPC) to ensure trajectory feasibility and safety.
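To make the last two items concrete, the following toy sketch flags grid cells as semantically unsafe by cosine similarity to failure-mode embeddings and then searches for a path that reaches the goal while avoiding them. Everything here is an assumption for illustration: the 3-dimensional vectors are hand-made stand-ins for real text embeddings, the threshold of 0.8 is arbitrary where the paper calibrates its constraints, and breadth-first search on a grid replaces the RRT, MPC, and reach-avoid machinery, so robot dynamics are ignored entirely.

```python
import math
from collections import deque

# Hand-made stand-ins for text embeddings of anticipated failure modes (hypothetical).
FAILURE_MODES = {"near fire": (1.0, 0.0, 0.0), "over crowd": (0.0, 1.0, 0.0)}
THRESHOLD = 0.8  # assumed value; a real system would calibrate it on labelled data


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def is_unsafe(embedding):
    """A cell is unsafe if its description is close to any failure mode."""
    return any(cosine(embedding, f) >= THRESHOLD for f in FAILURE_MODES.values())


def plan(grid, start, goal):
    """Breadth-first search over 4-connected cells, skipping unsafe cells.

    Returns the list of cells from start to goal, or None if the goal
    cannot be reached without entering an unsafe cell.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in parent and not is_unsafe(grid[nr][nc]):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


SAFE = (0.0, 0.0, 1.0)   # embedding of a benign description (hypothetical)
FIRE = (0.9, 0.1, 0.1)   # embedding close to "near fire" (hypothetical)
GRID = [
    [SAFE, FIRE, SAFE],
    [SAFE, FIRE, SAFE],
    [SAFE, SAFE, SAFE],
]
PATH = plan(GRID, (0, 0), (0, 2))
```

With this grid the only route from the top-left to the top-right corner detours through the bottom row, around the two cells flagged as close to "near fire".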

Evaluation and Results

To evaluate FORTRESS, the authors test it on synthetic datasets covering several robotics domains, including drones, boats, and vehicles, as well as on real-world data from ANYmal robot experiments. These experiments assess the framework's ability to identify semantically unsafe scenarios and to execute safe fallback plans. Noteworthy results include:

Semantic Safety Classification: Achieving balanced accuracy over 90% in detecting OOD failures using semantic embeddings.
Real-time Planning: Demonstrating improved planning success and safety in real-time scenarios compared to baselines that rely on static fallback plans or naïve object avoidance.
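For readers unfamiliar with the metric in the first item, balanced accuracy is the mean of the true-positive rate and the true-negative rate, which keeps a classifier from scoring well simply by predicting the majority class. The short sketch below computes it from labels; the example labels are invented for illustration and are not data from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tpr = tp / (tp + fn)  # fraction of unsafe cases that were caught
    tnr = tn / (tn + fp)  # fraction of safe cases that were left alone
    return (tpr + tnr) / 2


def accuracy(y_true, y_pred):
    """Plain fraction of correct predictions, for comparison."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Invented example: 2 unsafe cases (label 1) among 10, one of them missed.
Y_TRUE = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
Y_PRED = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

On these invented labels plain accuracy is 0.9, while balanced accuracy is 0.75 because half of the unsafe cases were missed.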

Implications and Future Work

The implications of this paper span both theoretical and practical domains. Theoretically, FORTRESS proposes a robust mechanism for real-time reasoning in open-ended environments, paving the way for future research in autonomous planning under uncertainty. Practically, the framework holds promise for enhancing robotic resilience and safety in dynamic and unstructured settings, where traditional planning methods may fail.

However, the framework has limitations, such as its reliance on pre-trained models, which may not transfer to every robotic platform. Future work may focus on refining dynamic fallback goals and further automating the generation of semantic strategies. There is also scope for extending FORTRESS to handle more complex, dynamic boundaries in semantic safety constraints.

In conclusion, FORTRESS represents a significant advance in robotics safety and planning, leveraging the capabilities of foundation models to effectively prevent OOD failures in real time. The framework's client-server architecture underscores the increasing role of AI in enhancing robotic decision-making under uncertainty.