LVLM-MPC Collaboration for Autonomous Driving: A Safety-Aware and Task-Scalable Control Architecture
The paper "LVLM-MPC Collaboration for Autonomous Driving: A Safety-Aware and Task-Scalable Control Architecture" presents a framework for autonomous driving that integrates a Large Vision-Language Model (LVLM) with Model Predictive Control (MPC). The integration aims to provide both task scalability and safety in autonomous driving environments.
Overview of the Hybrid Architecture
The proposed framework uses an LVLM for high-level task planning across diverse driving scenarios. The LVLM issues symbolic task commands, which a component called MPC Builder processes to automatically generate the corresponding MPCs. This hybrid architecture is designed to address two critical challenges in autonomous driving: scalability of task planning and safety assurance during execution.
Scalability and Safety Assurance
Scalability: Traditional model-based planners such as MPC scale poorly to the extensive task space that modern LVLMs can express, because each controller is usually hand-crafted for a limited set of tasks, which restricts versatility. Pairing MPC Builder with an LVLM enables rapid adaptation to new tasks by synthesizing MPCs from a compact library of primitive design elements.
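To make the idea of synthesizing a controller from primitive design elements concrete, the following is a minimal sketch in Python. It is not the paper's MPC Builder: the primitive names (`track_speed`, `track_lane`, `speed_limit`, `stay_on_road`), the command format, and the two-element state are all assumptions made for illustration, and the result is only a problem specification (stage cost plus constraints), not a solver.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical state: (lateral position y [m], speed v [m/s]).
State = Tuple[float, float]

# Hypothetical cost primitives: each returns one stage-cost term.
def track_speed(v_ref: float) -> Callable[[State], float]:
    return lambda s: (s[1] - v_ref) ** 2

def track_lane(y_ref: float) -> Callable[[State], float]:
    return lambda s: (s[0] - y_ref) ** 2

# Hypothetical constraint primitives: each returns one feasibility predicate.
def speed_limit(v_max: float) -> Callable[[State], bool]:
    return lambda s: s[1] <= v_max

def stay_on_road(y_min: float, y_max: float) -> Callable[[State], bool]:
    return lambda s: y_min <= s[0] <= y_max

COST_PRIMITIVES = {"track_speed": track_speed, "track_lane": track_lane}
CONSTRAINT_PRIMITIVES = {"speed_limit": speed_limit, "stay_on_road": stay_on_road}

@dataclass
class MPCSpec:
    """Composed problem description: summed cost terms and constraints."""
    costs: List[Callable[[State], float]]
    constraints: List[Callable[[State], bool]]

    def stage_cost(self, s: State) -> float:
        return sum(c(s) for c in self.costs)

    def is_feasible(self, s: State) -> bool:
        return all(g(s) for g in self.constraints)

def build_mpc(command: Dict[str, Dict[str, Dict[str, float]]]) -> MPCSpec:
    """Look up each named primitive and instantiate it with its parameters.

    An unknown primitive name raises KeyError, so a malformed command is
    rejected instead of being silently ignored.
    """
    costs = [COST_PRIMITIVES[n](**p) for n, p in command.get("costs", {}).items()]
    constraints = [
        CONSTRAINT_PRIMITIVES[n](**p)
        for n, p in command.get("constraints", {}).items()
    ]
    return MPCSpec(costs, constraints)

# Example symbolic command, in an assumed dictionary format.
command = {
    "costs": {"track_speed": {"v_ref": 25.0}, "track_lane": {"y_ref": 3.5}},
    "constraints": {"speed_limit": {"v_max": 30.0}},
}
spec = build_mpc(command)
```

The point of the design is that a new task only needs a new combination of names and parameters, not a new hand-written controller. In a real system the composed cost and constraints would be handed to a numerical optimal control solver over a prediction horizon.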
Safety Assurance: Existing frameworks typically follow a unidirectional pipeline in which a high-level command either succeeds or fails silently, with no feedback to the planner. The proposed bidirectional communication between the LVLM and MPC handles infeasibility explicitly: an infeasible command is automatically rejected, and the rejection triggers constructive replanning. The paper also introduces intermediate Optimal Control Problems (iOCPs) to ensure smooth task transitions, mitigating deadlocks and wasted computation.
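The reject-and-replan loop described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the LVLM is replaced by a stub function (`stub_planner`), the MPC feasibility test is replaced by a simple gap check (`stub_feasibility`), and the command names, the 20 m gap threshold, and the feedback string format are all hypothetical. The iOCP mechanism is not modeled here.

```python
from typing import Callable, List, Optional, Tuple

def plan_with_feedback(
    propose: Callable[[List[str]], str],
    check_feasible: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Ask the planner for a command, reject it if infeasible, and retry.

    Each rejection reason is appended to the feedback list that the planner
    sees on the next round. Returns (accepted command or None, feedback log).
    """
    feedback: List[str] = []
    for _ in range(max_rounds):
        command = propose(feedback)
        ok, reason = check_feasible(command)
        if ok:
            return command, feedback
        feedback.append(f"{command}: {reason}")
    return None, feedback

# Stand-in for the LVLM (assumption): tries a lane change first and falls
# back to lane following once it has received any rejection.
def stub_planner(feedback: List[str]) -> str:
    return "follow_lane" if feedback else "change_lane_left"

# Stand-in for the MPC feasibility check (assumption): a lane change needs
# at least 20 m of free gap in the target lane, and only 8 m is available.
LEFT_LANE_GAP_M = 8.0
REQUIRED_GAP_M = 20.0

def stub_feasibility(command: str) -> Tuple[bool, str]:
    if command == "change_lane_left" and LEFT_LANE_GAP_M < REQUIRED_GAP_M:
        return False, "insufficient gap in target lane"
    return True, ""

accepted, log = plan_with_feedback(stub_planner, stub_feasibility)
print(accepted, log)
```

Capping the number of rounds keeps the loop from stalling when no proposal is feasible; in that case the caller receives `None` together with the full rejection log and can fall back to a safe default behavior.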
Simulation and Results
The effectiveness of the LVLM-MPC collaboration is demonstrated through highway driving simulations, where the results indicate better safety and efficiency than the baseline methods. LVLM-MPCBuilder maintained a 100% success rate in these simulations, indicating that it can navigate congested traffic safely while still executing tasks flexibly.
Implications and Future Work
The integration of LVLMs and MPC has significant implications for autonomous vehicle design. The framework bridges the gap between the high-level semantic reasoning provided by foundation models and the precise, reliable vehicle operation afforded by MPC. Future work may address more complex scenarios and real-world deployment. Extending the approach to more diverse environmental conditions and task requirements is a further avenue for research.
In conclusion, this paper articulates a compelling case for the synergistic use of LVLMs and MPC in autonomous driving. The proposed architecture not only meets the demands of scalability and safety but also provides a robust platform for exploring next-generation autonomous vehicle capabilities.