
Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline

Published 16 Jun 2022 in cs.CV, cs.AI, and cs.RO | arXiv:2206.08129v2

Abstract: Current end-to-end autonomous driving methods either run a controller based on a planned trajectory or perform control prediction directly, which have spanned two separately studied lines of research. Seeing their potential mutual benefits to each other, this paper takes the initiative to explore the combination of these two well-developed worlds. Specifically, our integrated approach has two branches for trajectory planning and direct control, respectively. The trajectory branch predicts the future trajectory, while the control branch involves a novel multi-step prediction scheme such that the relationship between current actions and future states can be reasoned. The two branches are connected so that the control branch receives corresponding guidance from the trajectory branch at each time step. The outputs from two branches are then fused to achieve complementary advantages. Our results are evaluated in the closed-loop urban driving setting with challenging scenarios using the CARLA simulator. Even with a monocular camera input, the proposed approach ranks first on the official CARLA Leaderboard, outperforming other complex candidates with multiple sensors or fusion mechanisms by a large margin. The source code is publicly available at https://github.com/OpenPerceptionX/TCP

Citations (137)

Summary

  • The paper presents Trajectory-guided Control Prediction (TCP), a novel framework integrating trajectory planning and direct control prediction to improve end-to-end autonomous driving.
  • TCP employs a dual-branch architecture with multi-step control prediction and a trajectory-guided attention mechanism to combine the strengths of both approaches.
  • Evaluations on the CARLA simulator show TCP achieving a top driving score with only a monocular camera, demonstrating superior performance over traditional control-only or trajectory-only methods.

Overview of Trajectory-guided Control Prediction for Autonomous Driving

The paper "Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline" presents TCP (Trajectory-guided Control Prediction), a framework that addresses the trade-off between trajectory planning and direct control prediction in autonomous driving. The approach integrates the two paradigms into a single model, leveraging their complementary strengths to improve closed-loop driving performance.

Background and Motivation

Traditional end-to-end autonomous driving frameworks predominantly focus on either trajectory planning or direct control prediction from raw sensor inputs. These approaches have been developed and tested separately, each exhibiting distinct advantages and challenges. Trajectory planning, which involves predicting waypoints or future trajectories, offers extended temporal foresight and can be enriched with other predictive modules to improve safety and collision avoidance. However, it typically requires additional controllers like PID or model predictive controllers to translate planned trajectories into actionable control signals, which can complicate the framework and reduce responsiveness in dynamic scenarios.
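To make the planner-plus-controller pipeline concrete, the following is a minimal Python sketch of PID-based waypoint following. The gains, helper names, and waypoint convention here are illustrative assumptions, not the paper's actual controller:

```python
import math

class PID:
    """Minimal PID controller (hypothetical gains, not the paper's tuning)."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt=0.05):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

def waypoints_to_control(waypoints, speed, target_speed, turn_pid, speed_pid):
    """Turn planned waypoints (ego frame, x forward, y left) into steer/throttle."""
    # Lateral control: steer toward the heading of a near waypoint.
    wx, wy = waypoints[1]
    heading_error = math.atan2(wy, wx)
    steer = max(-1.0, min(1.0, turn_pid.step(heading_error)))
    # Longitudinal control: track a target speed.
    throttle = max(0.0, min(1.0, speed_pid.step(target_speed - speed)))
    return steer, throttle
```

This extra translation stage is exactly what makes trajectory-based pipelines heavier than direct control prediction: controller gains must be tuned separately from the learned planner.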

On the other hand, direct control prediction methods output control signals such as throttle, brake, and steering directly, typically conditioned only on the current observation. Although this simplifies the control inference process, the lack of explicit long-horizon reasoning compared with trajectory planning can lead to instability or delayed reactions to contextual changes or obstacles.

Methodology

TCP proposes a unified solution that integrates trajectory planning and control prediction, built on multi-task learning (MTL) principles. The framework adopts a dual-branch architecture, with one branch dedicated to predicting future trajectories while the other outputs control actions. Integration is achieved through shared inputs and an attention mechanism that allows inter-branch communication.
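A toy sketch of the dual-branch idea, using NumPy with random stand-in weights; the layer sizes, head names, and output dimensions below are illustrative, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # Toy dense layer as (W, b); stands in for learned weights.
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)

def forward(x, layer):
    W, b = layer
    return x @ W + b

# Shared encoder features feed both branches.
enc = linear(64, 32)
traj_head = linear(32, 8)   # 4 future waypoints as (x, y) pairs
ctrl_head = linear(32, 3)   # throttle, steer, brake

feat = np.tanh(forward(rng.standard_normal(64), enc))
waypoints = forward(feat, traj_head).reshape(4, 2)
control = forward(feat, ctrl_head)
```

The key point the sketch captures is that both branches read from the same shared representation, so supervision on either task shapes features used by the other.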

Key Components:

  • Multi-step Control Prediction: By forecasting not just immediate but also successive control actions, TCP considers the sequential dependencies of driving decisions, thus addressing the limitations of single-step independent and identically distributed (IID) assumptions in behavior cloning.
  • Trajectory-guided Attention Mechanism: This method uses the trajectory branch’s output to inform the control branch on which spatial regions of the environment to attend to, enhancing decision-making over multiple future steps.
  • Situation-based Fusion Scheme: TCP employs a dynamic combination of trajectory and control outputs, weighted by predetermined conditions (e.g., road scenarios like turning), thereby optimizing the output for varied driving contexts.
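The fusion step can be sketched as a situation-dependent weighted average of the two branches' outputs. The weight `alpha`, the function name, and the boolean situation flag below are illustrative assumptions; the paper's exact weighting and situation rule (e.g., which branch is trusted when turning) come from its design:

```python
def fuse_controls(ctrl_traj, ctrl_pred, specialist_situation, alpha=0.3):
    """Situation-based fusion: weighted average of per-branch control actions.

    ctrl_traj: action tuple derived from the trajectory branch
    ctrl_pred: action tuple from the direct control branch
    specialist_situation: whether the current scenario favors ctrl_pred
    """
    # Shift the weight toward whichever branch the situation favors.
    w = alpha if specialist_situation else 1.0 - alpha
    return tuple(w * a + (1.0 - w) * b for a, b in zip(ctrl_traj, ctrl_pred))
```

For example, with `alpha=0.3` and `specialist_situation=True`, the direct control branch contributes 70% of the fused action.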

Results and Implications

The effectiveness of TCP is demonstrated through evaluations conducted on the CARLA simulator, where it achieved superior performance metrics, including a top driving score on the CARLA Leaderboard. This was accomplished using a monocular camera, highlighting the framework's efficiency compared to previous methods reliant on multiple sensors and modalities.

Numerical and Qualitative Analysis:

  • A notable improvement in driving score and infraction handling was observed compared to both control-only and trajectory-only models.
  • Extensive ablation studies confirmed the efficacy of each design component, especially the multi-step and attention-guided features.

Future Directions and Challenges

The paper points to avenues for future work, including enhancing the fidelity of simulation interactions and further refining the integration strategy for broader driving scenarios. Moreover, while TCP outperformed competing methods on key metrics, real-world deployment and adaptability across different urban environments remain open challenges for stronger generalization.

TCP sets a foundational baseline in exploring the synergy between trajectory planning and direct control in autonomous driving. Its innovative integration and task-sharing paradigm offer potential pathways towards more complex, reliable multi-task learning applications in autonomous systems, promoting both theoretical insights and practical advancements in AI-based vehicular control systems.
