QPlanner Model: Modular QP Frameworks
- QPlanner is a modular framework that operationalizes quadratic programming for planning and sequence generation, with applications in robotics, autonomous vehicles, and retrieval-augmented generation.
- In robotics, QPlanner automates convex QP formulation to map high-level tasks into efficient control schemes, achieving millisecond-scale performance in inverse kinematics and whole-body control.
- For autonomous vehicles and RAG, QPlanner integrates global dynamic programming with local QP refinement and LLM-based sequence modeling to optimize path planning and query outline generation.
QPlanner is the name shared by multiple, independently developed models and frameworks that operationalize quadratic programming (QP)-based planning or structure-conditioned sequence generation, depending on domain. This article surveys three distinct instantiations of QPlanner: the core abstraction for convex QP-based robot planning in PlaCo (Duclusaud et al., 8 Nov 2025), a hybrid dynamic programming–QP trajectory planner for autonomous vehicles (Zhang, 2024), and a pretrained sequence model for query outline selection in retrieval-augmented generation (RAG) (Kim et al., 2024). These usages share an emphasis on modularity in complex decision domains but differ substantially in algorithmic substrate, application, and evaluation.
1. QPlanner in QP-based Robot Planning and Control
QPlanner, as realized within the PlaCo framework (Duclusaud et al., 8 Nov 2025), designates PlaCo's core abstraction layer for automatically assembling and solving convex QP formulations encompassing planning and whole-body control for robotics. PlaCo generalizes all such problems to the standard QP form

$$\min_{x} \; \tfrac{1}{2}\, x^\top P x + q^\top x \quad \text{subject to} \quad A x = b, \;\; G x \le h,$$

where the decision vector $x$ concatenates all planning or control variables (joint increments, slack variables, integrated command sequences, etc.). The matrices and vectors $P$, $q$, $A$, $b$, $G$, $h$ are assembled algorithmically from user-specified tasks, hard/soft constraints, and, for time-extended planning, discrete-time dynamics.
Mapping of semantic robotic tasks to QP terms is direct: each soft equality task of the form $A_t x = b_t$ with weight $w_t$ adds $w_t A_t^\top A_t$ to $P$ and $-w_t A_t^\top b_t$ to $q$. For instance, end-effector pose, center-of-mass, posture tracking, gear coupling, and support polygon constraints are all encoded with automatically generated Jacobians and offsets. Soft inequalities introduce extra slack variables.
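This weighted accumulation can be sketched in a few lines of numpy (an illustrative reduction, not PlaCo's API; an unconstrained linear solve stands in for a full QP solver):

```python
import numpy as np

def assemble_qp(tasks, n):
    """Accumulate weighted soft equality tasks  w * ||A x - b||^2
    into standard QP terms  1/2 x^T P x + q^T x."""
    P = np.zeros((n, n))
    q = np.zeros(n)
    for A, b, w in tasks:
        P += w * A.T @ A      # quadratic contribution
        q += -w * A.T @ b     # linear contribution
    return P, q

# Two conflicting 1-D "tasks" with different weights.
tasks = [
    (np.array([[1.0]]), np.array([1.0]), 10.0),  # pull x toward 1, weight 10
    (np.array([[1.0]]), np.array([0.0]), 1.0),   # pull x toward 0, weight 1
]
P, q = assemble_qp(tasks, 1)
x = np.linalg.solve(P, -q)  # unconstrained minimizer: weighted average 10/11
```

The minimizer lands at the weight-proportional compromise between the two targets, which is exactly the trade-off behavior soft tasks are meant to express.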
The software encapsulates this mapping in a layered architecture: Python bindings permit rapid prototyping (e.g., <10 lines to define an inverse kinematics (IK) QP), while a C++ back end (with Eigen, EiQuadProg) supports millisecond-scale latency (e.g., ≈200 μs/iteration on a 6-DOF arm, ≈700 μs on 30-DOF humanoids). Integration with solvers (qpOASES, OSQP) is modular. Representative primitives include real-time humanoid locomotion (whole-body IK with dynamic constraints), quadruped balancing under stance and CoM constraints, and joint coupling via gear/differential tasks.
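As a hedged illustration of what a per-iteration IK QP computes, the damped least-squares step below solves $\min_{\Delta q} \|J\,\Delta q - e\|^2 + \lambda\|\Delta q\|^2$ for a toy planar 2-link arm (the arm model and all names are assumptions for this sketch, not PlaCo code):

```python
import numpy as np

def ik_step(J, err, damping=1e-3):
    """One IK iteration as the unconstrained QP
    min_dq ||J dq - err||^2 + damping * ||dq||^2."""
    n = J.shape[1]
    H = J.T @ J + damping * np.eye(n)
    return np.linalg.solve(H, J.T @ err)

# Planar 2-link arm with unit link lengths and joint angles q.
def fk(q):
    return np.array([np.cos(q[0]) + np.cos(q[0] + q[1]),
                     np.sin(q[0]) + np.sin(q[0] + q[1])])

def jac(q):
    s1, s12 = np.sin(q[0]), np.sin(q[0] + q[1])
    c1, c12 = np.cos(q[0]), np.cos(q[0] + q[1])
    return np.array([[-s1 - s12, -s12],
                     [ c1 + c12,  c12]])

q = np.array([0.3, 0.8])
target = np.array([1.2, 0.9])          # reachable, away from singularities
for _ in range(50):
    q += ik_step(jac(q), target - fk(q))
# fk(q) converges to the target
```

Each loop iteration is one small QP solve, which is why frameworks like PlaCo can run such updates at sub-millisecond rates.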
This abstraction eliminates the need for end-users to directly manipulate matrix-valued QP definitions, instead declaring only the task/constraint structure and priorities, dramatically accelerating development and deployment of convex model-based control schemes (Duclusaud et al., 8 Nov 2025).
2. QPlanner for Autonomous Vehicle Decision-Making and Planning
In autonomous driving, QPlanner refers to a two-level path planning and trajectory optimization framework that integrates global DP-based search with local QP refinement, as detailed in (Zhang, 2024). The global planner models the environment in discretized Frenet coordinates, optimizing reference-lane-aligned paths via DP over stages, actions, and cost-to-go functions. The DP output is a table of lateral offsets along the road, one per longitudinal station.
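A minimal sketch of such a stage-wise DP over a discretized lateral grid (grid sizes, costs, and function names are illustrative, not the paper's implementation):

```python
import numpy as np

def dp_path(stage_cost, transition_cost, n_stages, n_lanes):
    """Backward DP over a Frenet-style grid: stage i, lateral bin j.
    Returns the min-cost sequence of lateral bins (one per stage)."""
    cost_to_go = np.zeros((n_stages, n_lanes))
    best_next = np.zeros((n_stages, n_lanes), dtype=int)
    cost_to_go[-1] = stage_cost[-1]
    for i in range(n_stages - 2, -1, -1):
        for j in range(n_lanes):
            trans = [transition_cost(j, k) + cost_to_go[i + 1, k]
                     for k in range(n_lanes)]
            best_next[i, j] = int(np.argmin(trans))
            cost_to_go[i, j] = stage_cost[i, j] + min(trans)
    # Roll the optimal policy out from the cheapest starting bin.
    path = [int(np.argmin(cost_to_go[0]))]
    for i in range(n_stages - 1):
        path.append(int(best_next[i, path[-1]]))
    return path

# 4 stages x 3 lateral bins; an "obstacle" makes bins 0 and 1 costly at stage 2.
stage_cost = np.array([[0, 1, 2],
                       [0, 1, 2],
                       [9, 9, 0],
                       [0, 1, 2]], dtype=float)
lane_change = lambda j, k: abs(j - k)  # penalize lateral jumps
path = dp_path(stage_cost, lane_change, 4, 3)
# the path swerves to bin 2 at stage 2 to avoid the high-cost bins
```

The resulting coarse offset sequence is exactly what the local QP stage then smooths and refines.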
Next, the local QP module, re-solved at high frequency (e.g., 10 Hz), optimizes $\min_{l} \; \tfrac{1}{2}\, l^\top H l + f^\top l$ within a sliding window, where $l = (l_1, \dots, l_N)$ collects the horizon's lateral positions and $H$, $f$ encode objectives for path adherence, smoothness, derivative regularization, and mid-lane bias. Matrix operators discretize derivatives (for $l'$, $l''$, $l'''$). Constraints include road and obstacle boundaries, jerk and curvature comfort limits, and linearized vehicle kinematics. Obstacle avoidance exploits S–T graphs by encoding dynamic object trajectories as time-indexed linear inequalities on $l$, covering both static and dynamic obstacles.
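A closed-form numpy sketch of an unconstrained version of this sliding-window QP, with finite-difference matrices standing in for the derivative-discretizing operators (weights and names are illustrative assumptions):

```python
import numpy as np

def smooth_path(l_ref, w_ref=1.0, w_d1=5.0, w_d2=50.0):
    """Closed-form solve of the unconstrained sliding-window QP
    min_l  w_ref*||l - l_ref||^2 + w_d1*||D1 l||^2 + w_d2*||D2 l||^2,
    where D1, D2 are first- and second-difference operators."""
    n = len(l_ref)
    D1 = np.diff(np.eye(n), axis=0)        # (n-1) x n first difference
    D2 = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second difference
    H = w_ref * np.eye(n) + w_d1 * D1.T @ D1 + w_d2 * D2.T @ D2
    f = -w_ref * l_ref
    return np.linalg.solve(H, -f)          # stationarity: H l + f = 0

# A jagged DP output gets smoothed while staying near the reference.
l_dp = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0])
l_qp = smooth_path(l_dp)
```

Adding the road, obstacle, and kinematic inequalities turns this closed-form solve into a constrained QP handled by a dedicated solver; the objective structure is unchanged.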
Integration of DP and QP proceeds via replan triggers on dynamic obstacle incursions, with DP recomputed as needed (time ≈120 ms), while QP cycles at ≈5–8 ms per solve. In empirical evaluations, the system achieves real-time tractability (<100 ms end-to-end for all modules), precise path following (average cross-track error 0.08 m), and high obstacle avoidance reliability (Zhang, 2024).
3. QPlanner as a Sequence Model for Coverage-Conditioned Retrieval-Augmented Generation
A third QPlanner instantiation is a 7-billion-parameter LLM for query outline generation under explicit coverage constraints ("coverage-conditioned") in RAG architectures (Kim et al., 2024). Here, QPlanner is a Llama 2-7B-Chat derivative (32 layers, 4096-dimensional hidden size, 32 attention heads), unmodified in network topology, trained to produce hierarchical query decompositions (QTree) and then select four-node outlines from these decompositions to satisfy user inclusion/exclusion instructions.
Supervision leverages the QTree dataset, comprising 10,000 hierarchically organized sets of subqueries generated by prompting GPT-4, with scenarios induced by randomly pairing base questions with explicit constraints. Candidate outlines are further filtered and ranked by GPT-4, with the highest-scoring outline forming the gold target. QPlanner undergoes supervised fine-tuning (SFT) for next-token prediction over 31,488 samples, followed by Direct Preference Optimization (DPO) to align with preferred human/GPT-4 outline rankings (8,568 preference pairs).
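For reference, the per-pair DPO objective is $-\log \sigma\big(\beta\,[(\log\pi_\theta(y_w) - \log\pi_{\mathrm{ref}}(y_w)) - (\log\pi_\theta(y_l) - \log\pi_{\mathrm{ref}}(y_l))]\big)$, rewarding the policy for preferring the chosen outline $y_w$ over the rejected $y_l$ relative to the SFT reference. A minimal numeric sketch (all log-probabilities are illustrative values, not the paper's data):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin compares policy vs. reference log-probs of the
    preferred (w) and rejected (l) outlines."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Policy already prefers the chosen outline -> small loss.
small = dpo_loss(-10.0, -20.0, ref_logp_w=-15.0, ref_logp_l=-15.0)
# Policy prefers the rejected outline -> larger loss.
large = dpo_loss(-20.0, -10.0, ref_logp_w=-15.0, ref_logp_l=-15.0)
```

In training, the log-probabilities are summed token-level scores of the full outline under the fine-tuned and frozen reference models.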
Deployed in a RAG pipeline, QPlanner's outlines serve two roles: each subquery drives targeted document retrieval (DuckDuckGo, 2 docs/query), and the outline structure conditions downstream generative answer synthesis. Evaluation by automatic (GPT-4) and human scoring shows statistically significant increases in outline-constrained answer satisfaction, with DPO-enhanced QPlanner outperforming baselines by up to 60% in pairwise preference and improving mean adherence scores from 2.79 (SFT) to 3.16 (DPO) on a five-point scale. Exclusion-type constraints remain harder, with slightly lower mean alignment scores.
In end-to-end RAG, outline-driven retrieval/answering significantly outperforms vanilla baselines under McNemar's test (χ² = 60). Moreover, within satisfactory responses, outline-driven outputs are preferred in over 65% of cases (Kim et al., 2024).
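McNemar's statistic for such paired system comparisons depends only on the discordant counts; a minimal sketch (the counts below are illustrative, not the paper's raw data):

```python
def mcnemar_chi2(b, c):
    """McNemar's chi-squared statistic from the discordant counts:
    b = items only system A answered satisfactorily,
    c = items only system B answered satisfactorily."""
    return (b - c) ** 2 / (b + c)

# Hypothetical discordant counts: (70 - 10)^2 / 80 = 45.0
chi2 = mcnemar_chi2(70, 10)
```

With one degree of freedom, values this large correspond to vanishingly small p-values, which is why the reported χ² = 60 indicates a decisive difference.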
4. Algorithmic Features and Interfaces
Across these domains, QPlanner shares an emphasis on structured, modular problem definition and transparent mapping from high-level semantic requirements (tasks, constraints, decomposition) to executable optimization or sequence selection. In PlaCo's implementation, Python APIs allow for succinct instantiation of tasks and constraints (e.g., add_position_task, add_com_polygon_constraint), supporting both rapid prototyping and performance-critical deployment. Solver routines support multiple QP engines (EiQuadProg, qpOASES, OSQP). Integrated routines automatically introduce Jacobians, slack augmentation, or dynamic constraints as required by the problem instance (Duclusaud et al., 8 Nov 2025).
In the autonomous vehicle setting, QPlanner's interface demarcates clear information flow between global (DP) and local (QP) modules, with SLAM-perceived obstacles driving QP constraint updates and trajectory tracking employing optimized QP outputs (Zhang, 2024).
For the RAG domain, QPlanner is prompted with a query, emits the QTree and selected outline as textual chains-of-thought, and seamlessly integrates as a front-end for multi-turn evidence retrieval and answer composition pipelines (Kim et al., 2024).
5. Performance Benchmarks and Empirical Outcomes
Performance data in each domain corroborates QPlanner's utility under both time and accuracy constraints. In PlaCo, IK solves on standard 6-DOF manipulators occur in ≈200 μs; 30-DOF humanoid problems resolve at ≈700 μs per iteration, allowing real-time, kHz-rate deployment for full-body robotic behaviors (Duclusaud et al., 8 Nov 2025). The autonomous vehicle QPlanner maintains QP solve times of ≈5–8 ms per cycle and admits global DP replanning at ≈120 ms, with average lateral errors ≤0.08 m and >99% dynamic obstacle avoidance in benchmark scenarios (Zhang, 2024). The LLM-based QPlanner achieves clear alignment gains in outline adherence and downstream answer preferences, with improvements robust to variation in intent operation type and query complexity (Kim et al., 2024).
6. Limitations and Prospective Directions
While QPlanner abstractions enable rapid development and improved specification fidelity, all present formulations exhibit limitations. In PlaCo, only convex QPs are natively supported, and dynamic constraints are restricted to those linearizable via discrete-time integrators. The vehicle-planning QPlanner fixes the planning horizon and currently favors soft over hard enforcement of vehicle kinematic constraints, with performance contingent on the quality of the DP path in highly dynamic environments. The LLM-based QPlanner prescribes a fixed four-node outline per query; the inability to flexibly vary outline length is identified as a constraint on representational adequacy for complex topics. Further, end-to-end utility is sensitive to the reliability of downstream retrieval and fact-verification modules.
This suggests that as QPlanner architectures evolve, future work will entail support for variable-length structure induction, integrated fact-verification (in RAG), handling non-convex or nonlinear dynamics, and richer benchmarking for coverage-conditioned sequence modeling (Kim et al., 2024).
7. Comparative Summary
| Domain | Optimization Substrate | Typical Usage |
|---|---|---|
| Robotics planning/control (Duclusaud et al., 8 Nov 2025) | Convex QP assembly, IK solvers | Whole-body IK, constrained planning, MPC |
| Autonomous vehicle (Zhang, 2024) | Hybrid DP (global) + QP (local) | Lane-following, obstacle avoidance, trajectory MPC |
| RAG/LLMs (Kim et al., 2024) | SFT/DPO fine-tuned transformer | Outline generation, query selection for RAG |
The QPlanner designation thus spans a class of planning and structural reasoning modules facilitating specification-driven optimization or outline selection in high-dimensional or constraint-sensitive contexts, abstracting underlying complexity by mapping user/task requirements to formally tractable representations or sequence outputs.