
QPlanner Model: Modular QP Frameworks

Updated 6 February 2026
  • QPlanner is a modular framework that operationalizes quadratic programming for planning and sequence generation, with applications in robotics, autonomous vehicles, and retrieval-augmented generation.
  • In robotics, QPlanner automates convex QP formulation to map high-level tasks into efficient control schemes, achieving millisecond-scale performance in inverse kinematics and whole-body control.
  • For autonomous vehicles and RAG, QPlanner integrates global dynamic programming with local QP refinement and LLM-based sequence modeling to optimize path planning and query outline generation.

QPlanner is the name shared by several independently developed models and frameworks that operationalize quadratic programming (QP)-based planning or, in one case, structure-conditioned sequence generation. This article surveys three distinct instantiations of QPlanner: as the core abstraction for convex QP-based robot planning (PlaCo) (Duclusaud et al., 8 Nov 2025), as a hybrid dynamic programming–QP trajectory planner for autonomous vehicles (Zhang, 2024), and as a pretrained sequence model for query outline selection under retrieval-augmented generation (RAG) (Kim et al., 2024). These usages share an emphasis on modularity in complex decision domains but differ substantially in algorithmic substrate, application, and evaluation.

1. QPlanner in QP-based Robot Planning and Control

QPlanner, as realized within the PlaCo framework (Duclusaud et al., 8 Nov 2025), designates PlaCo’s core abstraction layer for automatically assembling and solving convex QP formulations encompassing planning and whole-body control for robotics. PlaCo generalizes all such problems to the standard QP form:

$$\begin{aligned} \text{minimize}_x \quad & \frac{1}{2} x^\top P x + a^\top x \\ \text{subject to} \quad & Gx \leq h \\ & Ax = b \end{aligned}$$

Here, the decision vector $x$ concatenates all planning or control variables (joint increments, slack variables, integrated command sequences, etc.). The matrices $(P, a, G, h, A, b)$ are assembled algorithmically from user-specified tasks, hard/soft constraints, and, for time-extended planning, discrete-time dynamics.

Mapping of semantic robotic tasks to QP terms is direct: each soft equality task of the form $\|M_i x - v_i\|^2$ with weight $w_i$ translates to $P \leftarrow P + w_i M_i^\top M_i$ and $a \leftarrow a - w_i M_i^\top v_i$. For instance, end-effector pose, center-of-mass, posture tracking, gear coupling, and support polygon constraints are all encoded with automatically generated Jacobians and offsets. Soft inequalities introduce extra slack variables.
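The accumulation rule above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not PlaCo's implementation: the function names `assemble_soft_tasks` and `solve_equality_qp` are hypothetical, the tasks are toy matrices rather than robot Jacobians, and inequality constraints $Gx \leq h$ are left out because they require an actual QP solver. Note that the update matches the weighted task cost up to an overall factor of 2 and an additive constant, neither of which changes the minimizer.

```python
import numpy as np

def assemble_soft_tasks(n, tasks):
    """Accumulate weighted tasks ||M x - v||^2 into the QP terms (P, a).

    `tasks` is an iterable of (M, v, w) triples; names are illustrative only.
    """
    P = np.zeros((n, n))
    a = np.zeros(n)
    for M, v, w in tasks:
        P += w * M.T @ M   # P <- P + w M^T M
        a -= w * M.T @ v   # a <- a - w M^T v
    return P, a

def solve_equality_qp(P, a, A=None, b=None):
    """Solve min 1/2 x^T P x + a^T x subject to A x = b via the KKT system."""
    n = P.shape[0]
    if A is None:
        return np.linalg.solve(P, -a)
    m = A.shape[0]
    kkt = np.block([[P, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-a, b])
    return np.linalg.solve(kkt, rhs)[:n]

# Two toy soft tasks on x in R^2.
tasks = [
    (np.eye(2), np.array([1.0, 2.0]), 1.0),             # track the target (1, 2)
    (np.array([[1.0, -1.0]]), np.array([0.0]), 10.0),   # softly couple x0 and x1
]
P, a = assemble_soft_tasks(2, tasks)
x_soft = solve_equality_qp(P, a)
# Add one hard equality constraint: x0 + x1 = 4.
x_hard = solve_equality_qp(P, a, A=np.array([[1.0, 1.0]]), b=np.array([4.0]))
```

Raising the coupling weight pulls `x_soft` toward `x0 == x1`, which is how task priorities are expressed in a weighted formulation; the hard constraint is satisfied exactly regardless of the weights.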

The software encapsulates this mapping in a layered architecture: Python bindings permit rapid prototyping (e.g., <10 lines to define an inverse kinematics (IK) QP), while a C++ back end (with Eigen, EiQuadProg) supports millisecond-scale latency (e.g., ≈200 μs/iteration on a 6-DOF arm, ≈700 μs on 30-DOF humanoids). Integration with solvers (qpOASES, OSQP) is modular. Representative primitives include real-time humanoid locomotion (whole-body IK with dynamic constraints), quadruped balancing under stance and CoM constraints, and joint coupling via gear/differential tasks.

This abstraction eliminates the need for end-users to directly manipulate matrix-valued QP definitions, instead declaring only the task/constraint structure and priorities, dramatically accelerating development and deployment of convex model-based control schemes (Duclusaud et al., 8 Nov 2025).

2. QPlanner for Autonomous Vehicle Decision-Making and Planning

In autonomous driving, QPlanner refers to a two-level path planning and trajectory optimization framework that integrates global DP-based search with local QP refinement, as detailed in (Zhang, 2024). The global planner models the environment in discretized Frenet coordinates, optimizing reference-lane-aligned paths via DP over stages, actions, and cost-to-go functions. The DP output is a table $\{l^*_i\}$ of lateral offsets along the road.
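A backward cost-to-go recursion over a Frenet grid can be sketched as follows. This is an illustrative toy, not the planner from the cited work: the function name `dp_lateral_path`, the cost terms (squared lateral offset plus squared offset change), and the weights are all assumptions chosen for the example.

```python
import numpy as np

def dp_lateral_path(obstacle_cost, offsets, start_index, w_offset=1.0, w_change=5.0):
    """Return lateral offsets l*_i minimizing a stage-wise cost over a grid.

    obstacle_cost[i, j] is the penalty for occupying offsets[j] at station i.
    """
    n_stations, n_offsets = obstacle_cost.shape
    # Stage cost: obstacle penalty plus a pull toward the reference line (l = 0).
    node_cost = obstacle_cost + w_offset * offsets[None, :] ** 2
    # Transition cost between consecutive offsets, indexed [current, next].
    trans = w_change * (offsets[:, None] - offsets[None, :]) ** 2
    cost_to_go = np.empty((n_stations, n_offsets))
    best_next = np.zeros((n_stations, n_offsets), dtype=int)
    cost_to_go[-1] = node_cost[-1]
    for i in range(n_stations - 2, -1, -1):        # backward recursion
        total = trans + cost_to_go[i + 1][None, :]
        best_next[i] = np.argmin(total, axis=1)
        cost_to_go[i] = node_cost[i] + total.min(axis=1)
    path = [start_index]                            # forward rollout
    for i in range(n_stations - 1):
        path.append(int(best_next[i, path[-1]]))
    return offsets[path]

offsets = np.linspace(-1.5, 1.5, 7)     # lateral grid in metres
obstacle_cost = np.zeros((6, 7))
obstacle_cost[3, 2:5] = 1e3             # obstacle blocking the lane centre at station 3
l_star = dp_lateral_path(obstacle_cost, offsets, start_index=3)
```

The resulting table is coarse by construction, since it is restricted to grid values; that coarseness is what motivates a continuous refinement stage afterwards.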

Next, the local QP module, re-solved at high frequency (e.g., 10 Hz), optimizes within a sliding window:

$$\min_{x} \; \frac{1}{2} x^\top H x + f^\top x$$

where $x = [l_0, l_1, \dots, l_{N_h}]^\top$ are the horizon's lateral positions, and $H, f$ encode objectives for path adherence, smoothness, derivative regularization, and mid-lane bias. Matrix operators $(D_1, D_2, D_3)$ discretize derivatives (for $l'$, $l''$, $l'''$). Constraints include road and obstacle boundaries, jerk and curvature comfort limits, and linearized vehicle kinematics. Obstacle avoidance exploits S–T graphs by encoding dynamic object trajectories as time-indexed linear inequalities on $l_i$, incorporating both static and dynamic hindrances.
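One plausible way to assemble $H$ and $f$ from finite-difference operators is sketched below. The weights, the helper names `difference_operator` and `build_smoothing_qp`, and the exact cost decomposition are assumptions made for illustration; the constraints listed above are omitted, so the example solves only the unconstrained problem in closed form.

```python
import numpy as np

def difference_operator(n, order, ds):
    """Finite-difference matrix D_order acting on n samples spaced ds apart."""
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0) / ds   # each pass differences the rows once more
    return D

def build_smoothing_qp(l_ref, ds, w_ref=1.0, w1=1.0, w2=10.0, w3=10.0, w_mid=0.1):
    """Build (H, f) for 1/2 x^T H x + f^T x from tracking and smoothness terms."""
    l_ref = np.asarray(l_ref, dtype=float)
    n = len(l_ref)
    # Path adherence (w_ref) and mid-lane bias toward l = 0 (w_mid).
    H = (w_ref + w_mid) * np.eye(n)
    # Penalties on l', l'', l''' through D_1, D_2, D_3.
    for w, order in ((w1, 1), (w2, 2), (w3, 3)):
        D = difference_operator(n, order, ds)
        H += w * D.T @ D
    f = -w_ref * l_ref
    return H, f

ds = 1.0
l_ref = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # blocky DP-style path
H, f = build_smoothing_qp(l_ref, ds)
x_smooth = np.linalg.solve(H, -f)     # unconstrained minimizer: H x = -f
D2 = difference_operator(len(l_ref), 2, ds)
```

In a full planner, `H` and `f` would be handed to a QP solver together with the boundary and comfort inequalities; the closed-form solve here only shows how the objective terms combine.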

Integration of DP and QP proceeds via replan triggers on dynamic obstacle incursions, with DP recomputed as needed (time ≈120 ms), while QP cycles at ≈5–8 ms per solve. In empirical evaluations, the system achieves real-time tractability (<100 ms end-to-end for all modules), precise path following (average cross-track error 0.08 m), and high obstacle avoidance reliability (Zhang, 2024).

3. QPlanner as a Sequence Model for Coverage-Conditioned Retrieval-Augmented Generation

A third QPlanner instantiation is a 7-billion-parameter LLM for query outline generation under explicit coverage constraints (“coverage-conditioned,” or $C^2$) in RAG architectures (Kim et al., 2024). Here, QPlanner is a Llama 2-7B-Chat derivative (32 layers, hidden dimension 4096, 32 attention heads), unmodified in network topology, trained to produce hierarchical decompositions (QTree) and then select four-node outlines from these decompositions to satisfy user inclusion/exclusion instructions.

Supervision leverages the QTree dataset, comprising 10,000 hierarchically organized sets of subqueries generated by prompting GPT-4, with $C^2$ scenarios induced by randomly pairing base questions with explicit constraints. Candidate outlines are further filtered and ranked by GPT-4, with the highest-scoring outline forming the gold target. QPlanner undergoes supervised fine-tuning (SFT) for next-token prediction over 31,488 samples, followed by Direct Preference Optimization (DPO) to align with preferred human/GPT-4 outline rankings (8,568 preference pairs).
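For reference, the standard per-pair DPO objective is $-\log \sigma\big(\beta[(\log \pi_\theta(y_w|x) - \log \pi_{\text{ref}}(y_w|x)) - (\log \pi_\theta(y_l|x) - \log \pi_{\text{ref}}(y_l|x))]\big)$, where $y_w$ is the preferred outline and $y_l$ the rejected one. The sketch below evaluates it on scalar sequence log-probabilities; the function name, the value of `beta`, and the example numbers are assumptions, not values reported for QPlanner.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss from sequence log-probabilities (beta is an assumed value)."""
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(margin)) = log(1 + exp(-margin)), written in a stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# Policy identical to the reference: zero margin, loss equals log(2).
loss_neutral = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Policy has shifted probability toward the chosen outline: loss drops.
loss_aligned = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

The reference terms keep the fine-tuned policy anchored to the SFT model, so the loss rewards relative shifts in preference rather than raw likelihood.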

Deployed in a RAG pipeline, QPlanner-delineated outlines serve two purposes: each subquery is used for targeted document retrieval (DuckDuckGo, 2 docs/query), and the outline structure conditions downstream generative answer synthesis. Evaluation by automatic (GPT-4) and human scoring shows statistically significant increases in outline-constrained answer satisfaction, with DPO-enhanced QPlanner outperforming baselines by up to 60% in pairwise preference and improving mean $C^2$ adherence scores from 2.79 (SFT) to 3.16 (DPO) on a five-point scale. Exclusion-type constraints remain harder, with slightly lower mean alignment scores.
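The two roles of the outline can be made concrete with a small orchestration sketch. Everything here is hypothetical scaffolding: `outline_driven_rag` and the three callables stand in for the planner model, the search back end, and the answer generator, and the stubs exist only so the example runs without network access or model weights.

```python
from typing import Callable, List

def outline_driven_rag(
    query: str,
    plan_outline: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str, List[str], List[str]], str],
    docs_per_subquery: int = 2,
    outline_size: int = 4,
) -> str:
    """Plan an outline, retrieve evidence per subquery, then compose an answer."""
    outline = plan_outline(query)
    if len(outline) != outline_size:
        raise ValueError(f"expected {outline_size} subqueries, got {len(outline)}")
    evidence: List[str] = []
    for subquery in outline:                       # role 1: targeted retrieval
        evidence.extend(retrieve(subquery, docs_per_subquery))
    return generate(query, outline, evidence)      # role 2: structure-conditioned answer

# Stub components, for illustration only.
def fake_planner(q: str) -> List[str]:
    return [f"{q} :: aspect {k}" for k in range(1, 5)]

def fake_retriever(subquery: str, k: int) -> List[str]:
    return [f"doc{j} for [{subquery}]" for j in range(k)]

def fake_generator(q: str, outline: List[str], evidence: List[str]) -> str:
    return f"{len(outline)} sections, {len(evidence)} documents"

answer = outline_driven_rag("history of QP solvers", fake_planner, fake_retriever, fake_generator)
```

With four subqueries and two documents each, the generator receives eight evidence passages, which makes the per-query retrieval budget explicit.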

In end-to-end RAG, outline-driven retrieval/answering significantly outperforms vanilla baselines in McNemar’s test ($\chi^2 = 60$, $p < 0.001$). Moreover, within satisfactory responses, outline-driven outputs are preferred in over 65% of cases (Kim et al., 2024).
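McNemar's test compares two systems on paired outcomes using only the discordant pairs: with $b$ cases where only the first system succeeds and $c$ where only the second does, the statistic is $\chi^2 = (b - c)^2 / (b + c)$ with one degree of freedom. The sketch below computes it without a continuity correction; the counts are made up for illustration and are not the study's data.

```python
import math

def mcnemar_test(b, c):
    """McNemar chi-square statistic (no continuity correction) and its p-value.

    b: pairs where only system A is satisfactory; c: pairs where only system B is.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical discordant counts, chosen only to exercise the formula.
statistic, p_value = mcnemar_test(b=40, c=10)
```

Concordant pairs, where both systems succeed or both fail, do not enter the statistic at all, which is why the test suits before/after comparisons on the same query set.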

4. Algorithmic Features and Interfaces

Across these domains, QPlanner shares an emphasis on structured, modular problem definition and transparent mapping from high-level semantic requirements (tasks, constraints, decomposition) to executable optimization or sequence selection. In PlaCo’s implementation, Python APIs allow for succinct instantiation of tasks and constraints (e.g., add_position_task, add_com_polygon_constraint), supporting both rapid prototyping and performance-critical deployment. Solver routines support multiple QP engines (EiQuadProg, qpOASES, OSQP). Integrated routines automatically introduce Jacobian, slack augmentation, or dynamic constraints as required by the problem instance (Duclusaud et al., 8 Nov 2025).

In the autonomous vehicle setting, QPlanner's interface demarcates clear information flow between global (DP) and local (QP) modules, with SLAM-perceived obstacles driving QP constraint updates and trajectory tracking employing optimized QP outputs (Zhang, 2024).

For the RAG domain, QPlanner is prompted with a $C^2$ query, emits the QTree and selected outline as textual chains-of-thought, and integrates as a front-end for multi-turn evidence retrieval and answer composition pipelines (Kim et al., 2024).

5. Performance Benchmarks and Empirical Outcomes

Performance data in each domain corroborates QPlanner's utility under both time and accuracy constraints. In PlaCo, IK solves on standard 6-DOF manipulators take ≈200 μs, and 30-DOF humanoid problems resolve at ≈700 μs per iteration, allowing real-time (>1 kHz) deployment for full-body robotic behaviors (Duclusaud et al., 8 Nov 2025). The autonomous vehicle QPlanner maintains QP solve times of ≈5–8 ms per cycle and admits global DP replanning at ≈120 ms, with average lateral errors ≤0.08 m and near-perfect (>99%) dynamic obstacle avoidance in benchmark scenarios (Zhang, 2024). LLM-based QPlanner achieves clear alignment gains in $C^2$ outline adherence and downstream answer preferences, with improvements robust to variation in intent operation type and query complexity (Kim et al., 2024).
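The per-iteration solve times quoted above convert to upper bounds on control-loop rate by simple inversion. The snippet below does that arithmetic, assuming the solve is the only cost in the loop (sensing, communication, and actuation overhead are ignored); the helper name is illustrative.

```python
def max_loop_rate_hz(solve_time_us):
    """Upper bound on loop rate if each iteration costs solve_time_us microseconds."""
    return 1e6 / solve_time_us

arm_rate = max_loop_rate_hz(200.0)        # 6-DOF arm: 5000 Hz
humanoid_rate = max_loop_rate_hz(700.0)   # 30-DOF humanoid: roughly 1.4 kHz

# Vehicle QP: worst-case 8 ms solve inside a 10 Hz (100 ms) replanning period.
qp_budget_fraction = 8.0 / 100.0          # fraction of the cycle spent solving
```

Both robotics figures sit above 1 kHz, while the vehicle QP uses under a tenth of its cycle budget, leaving headroom for perception and tracking.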

6. Limitations and Prospective Directions

While QPlanner abstractions enable rapid development and improved specification fidelity, all present formulations exhibit limitations. In PlaCo, only convex QPs are natively supported, and dynamic constraints are restricted to those linearizable via discrete-time integrators. The vehicle planning QPlanner fixes the planning horizon and currently favors soft over hard enforcement of vehicle kinematic constraints, with performance contingent on the quality of the DP path in highly dynamic environments. The LLM-based QPlanner prescribes a fixed four-step outline per query; the inability to vary outline length is identified as a constraint on representational adequacy for complex topics. Further, end-to-end utility is sensitive to the reliability of downstream retrieval and fact-verification modules.

This suggests that as QPlanner architectures evolve, future work will entail support for variable-length structure induction, integrated fact-verification (in RAG), handling non-convex or nonlinear dynamics, and richer benchmarking for coverage-conditioned sequence modeling (Kim et al., 2024).

7. Comparative Summary

| Domain | Optimization Substrate | Typical Usage |
|---|---|---|
| Robotics planning/control (Duclusaud et al., 8 Nov 2025) | Convex QP assembly, IK solvers | Whole-body IK, constrained planning, MPC |
| Autonomous vehicle (Zhang, 2024) | Hybrid DP (global) + QP (local) | Lane-following, obstacle avoidance, trajectory MPC |
| RAG/LLMs (Kim et al., 2024) | SFT/DPO fine-tuned transformer | $C^2$ outline generation, RAG query selection |

The QPlanner designation thus spans a class of planning and structural reasoning modules facilitating specification-driven optimization or outline selection in high-dimensional or constraint-sensitive contexts, abstracting underlying complexity by mapping user/task requirements to formally tractable representations or sequence outputs.
