Augmented HQP for Collaborative Robotics
- Augmented HQP is a real-time control framework for human-robot collaboration that hierarchically integrates vision-based action recognition, adaptive workspace constraints, and ergonomic optimization.
- The framework simultaneously optimizes robot joint velocities and end-effector targets, ensuring precise, posture-friendly, and intuitive task execution.
- Empirical results demonstrate improved operator usability, reduced workload, and enhanced task performance in tool handover and cooperative manipulation scenarios.
Augmented Hierarchical Quadratic Programming (AHQP) is a real-time, lexicographic multi-task control framework for human-robot collaboration that hierarchically integrates vision-based human action recognition, adaptive workspace soft constraints, and ergonomics optimization. AHQP adapts the standard "stack-of-tasks" hierarchical quadratic programming (HQP) paradigm by simultaneously optimizing both robot joint velocities and desired end-effector (EE) velocities, enabling fine-grained, ergonomic, and sociable robot behavior during collaborative tasks. This formulation allows intuitive human guidance of robotic systems, seamless online adaptation of shared workspaces, and direct incorporation of posture-friendly motion, as demonstrated in tasks such as tool handover, cooperative manipulation, and human-following mobile manipulation (Tassi et al., 2022).
1. Mathematical Structure and Optimization Hierarchy
The AHQP framework is grounded in the HQP formalism, which solves a stack of prioritized quadratic programs for a generic decision variable $\chi$, subject at each priority level $k$ to distinct costs and constraints:
$$\chi_k^* = \arg\min_{\chi} \tfrac{1}{2}\left\| A_k \chi - b_k \right\|^2 \quad \text{s.t.} \quad C_k \chi \le d_k, \quad E_k \chi = f_k,$$
with $A_k$, $b_k$ defining the cost for the $k$-th task, $C_k$, $d_k$ the inequality constraints, and $E_k$, $f_k$ the equality constraints. Enforcing strict prioritization requires lower-priority solutions to lie in the nullspace of higher-priority tasks, operationalized by stacking equality constraints from superior levels.
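The strict-priority idea can be sketched for the equality-only case. The Python snippet below is a minimal illustration under stated assumptions, not the solver used by the authors: it ignores inequality constraints, and the function names `nullspace` and `solve_hierarchy` as well as the toy task matrices are introduced here purely for illustration.

```python
import numpy as np

def nullspace(A, tol=1e-10):
    """Orthonormal basis of the nullspace of A, computed via SVD."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vt[rank:].T  # columns span null(A)

def solve_hierarchy(tasks):
    """Minimize ||A_k x - b_k||^2 level by level, in strict priority order.

    tasks: list of (A_k, b_k), highest priority first.
    Inequality constraints are omitted in this simplified sketch.
    """
    n = tasks[0][0].shape[1]
    x = np.zeros(n)
    N = np.eye(n)  # directions still free for lower-priority levels
    for A, b in tasks:
        if N.shape[1] == 0:
            break  # no freedom left for lower levels
        AN = A @ N
        # Least-squares step restricted to the remaining free directions.
        z, *_ = np.linalg.lstsq(AN, b - A @ x, rcond=None)
        x = x + N @ z
        N = N @ nullspace(AN)
    return x

# Toy stack: level 2 conflicts with level 1, level 3 is independent.
A1, b1 = np.array([[1.0, 1.0, 0.0]]), np.array([1.0])
A2, b2 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]), np.array([1.0, 1.0])
A3, b3 = np.array([[0.0, 0.0, 1.0]]), np.array([2.0])

x = solve_hierarchy([(A1, b1), (A2, b2), (A3, b3)])
```

In this toy stack the first level is met exactly, the second level is satisfied only as well as the first allows, and the third uses the remaining degree of freedom.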
A central innovation in AHQP lies in the design of the augmented decision variable:
$$\chi = \begin{bmatrix} \dot{q}^\top & \dot{x}_d^\top \end{bmatrix}^\top,$$
where $\dot{q}$ are the $n$-DoF robot joint velocities and $\dot{x}_d$ are the desired end-effector velocities, both optimized online at the same time. This enables a tight coupling of robot kinematics with workspace and ergonomics objectives (Tassi et al., 2022).
2. Task Levels and Cost Functions
AHQP organizes control objectives into a lexicographically ordered stack of three core levels:
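- Primary Task: Closed-Loop Inverse Kinematics (CLIK). The top-priority QP enforces a closed-loop inverse-kinematics objective over $\chi = [\dot{q}^\top, \dot{x}_d^\top]^\top$. With $J$ the robot Jacobian, $x_a$ the actual EE pose, and $x_{d,t-1}$ the previous desired pose, the cost is
$$\left\| J\dot{q} - \dot{x}_d - K\left(x_{d,t-1} + \dot{x}_d\,\Delta t - x_a\right) \right\|^2,$$
where $K$ is the proportional gain and $\Delta t$ the integration step.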
- Secondary Task: Soft Constraint on Shared Workspace (HRSW). To allow dynamic deformation of the human-robot shared workspace, the EE goal $x_d$ is constrained via a slackened box constraint
$$x_{\min} - w \le x_d \le x_{\max} + w,$$
where $w$ is a slack vector penalized in a second-level QP with cost
$$\|w\|^2.$$
- Tertiary Task: Ergonomics Optimization. Ergonomics is addressed by fitting a lightweight Cartesian map $f_e$ to approximate REBA-based human comfort scores from hand data, resulting in a quadratic tertiary cost
$$\left\| A_e x_d - b_e \right\|^2,$$
with $A_e$ and $b_e$ obtained offline from human demonstrations and $x_d$ the EE goal.
The solver proceeds lexicographically, at each level projecting the candidate solution into the nullspace of higher-priority tasks to ensure strict task ordering (Tassi et al., 2022).
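To make the first level concrete, the following Python sketch assembles a CLIK cost in the augmented variable and checks that its least-squares solution is kinematically consistent. It assumes the residual takes the common closed-loop form $J\dot{q} - \dot{x}_d - K(x_{d,t-1} + \dot{x}_d\,\Delta t - x_a)$; the helper name `clik_task`, the 3-joint Jacobian, and all numerical values are hypothetical.

```python
import numpy as np

def clik_task(J, x_a, x_d_prev, K, dt):
    """Return (A1, b1) such that ||A1 @ chi - b1||^2 is the CLIK cost.

    chi = [q_dot, x_dot_d]; the residual assumed here is
    J q_dot - x_dot_d - K (x_d_prev + x_dot_d * dt - x_a).
    """
    m = J.shape[0]
    A1 = np.hstack([J, -(np.eye(m) + K * dt)])
    b1 = K @ (x_d_prev - x_a)
    return A1, b1

# Hypothetical 3-joint arm with a 2-D task space.
J = np.array([[1.0, 0.5, 0.2],
              [0.0, 1.0, 0.4]])
K = 2.0 * np.eye(2)
dt = 0.001
x_a = np.array([0.30, 0.10])
x_d_prev = np.array([0.32, 0.12])

A1, b1 = clik_task(J, x_a, x_d_prev, K, dt)

# Minimum-norm solution over joint AND end-effector velocities.
chi, *_ = np.linalg.lstsq(A1, b1, rcond=None)
q_dot, x_dot_d = chi[:3], chi[3:]

# Integrate the EE goal and evaluate the CLIK residual directly.
x_d = x_d_prev + x_dot_d * dt
residual = J @ q_dot - x_dot_d - K @ (x_d - x_a)
```

Because $\dot{x}_d$ is a free variable, the first level leaves a large nullspace in which the lower-priority workspace and ergonomics costs can move the EE goal.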
3. Augmentation Relative to Standard HQP
Standard HQP for inverse kinematics fixes the desired EE velocity $\dot{x}_d$ a priori from a trajectory generator, optimizing the joint velocities $\dot{q}$ exclusively. AHQP augments this structure by treating $\dot{x}_d$ as an optimization variable, yielding several key modifications:
- The CLIK cost incorporates $\dot{x}_d$ directly in the first-level QP.
- The slack cost on the slack vector $w$ explicitly softens workspace boundaries, allowing adaptive shaping of the shared human-robot region in real time.
- The ergonomics cost is handled directly within the same hierarchical stack, removing the need for external planners.
This unification enables real-time integration of kinematic consistency, workspace adaptation, and ergonomic optimization within a single lexicographic QP loop (Tassi et al., 2022).
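The softening effect of the slack can be illustrated numerically. The sketch below assumes a symmetric relaxation of the form $x_{\min} - w \le x_d \le x_{\max} + w$ with $w \ge 0$; the function name `minimal_slack` and the example values are hypothetical, and the actual constraint layout in AHQP may differ.

```python
import numpy as np

def minimal_slack(x_d, x_min, x_max):
    """Smallest w >= 0 with x_min - w <= x_d <= x_max + w (element-wise)."""
    return np.maximum(0.0, np.maximum(x_min - x_d, x_d - x_max))

x_min = np.zeros(3)
x_max = np.ones(3)
x_d = np.array([0.5, 1.2, -0.1])   # second and third axes leave the box

w = minimal_slack(x_d, x_min, x_max)
slack_cost = float(w @ w)          # quadratic penalty paid at the second level
```

An EE goal inside the box costs nothing, while a goal outside it is still feasible but pays a quadratic penalty that grows with the violation.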
4. Vision-Driven Task Adaptation and Human-Robot Interaction
AHQP tightly couples external vision modules with hierarchical control via:
- Object-Surface Classification:
A ResNet+SVM classifier on RGB images provides a binary output $s \in \{0, 1\}$; during collaborative states (recognized via action detection), this output dynamically restricts EE orientation by updating the corresponding constraint bounds.
- Action Recognition:
SlowOnly@ResNet50, pretrained on Kinetics-400, infers a class confidence vector $c$. If the "start walk" action is top-ranked, the HRSW window shifts adaptively along direction $d$, modifying workspace bounds as
$$x_{\min} \leftarrow x_{\min} + \delta\,d, \qquad x_{\max} \leftarrow x_{\max} + \delta\,d,$$
where $\delta$ is a step parameter.
- 3D Human-Hand Tracking:
OpenPose processes RGB-D frames to estimate human hand positions $x_h$. Assuming the hand position coincides with the EE goal ($x_h \approx x_d$), the ergonomics cost can be evaluated directly online.
This perception pipeline allows for instantaneous, vision-driven adaptation of both workspace and ergonomic constraints, enabling sociable human-commanded robot behavior (Tassi et al., 2022).
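The action-driven workspace update can be sketched as follows. This is a minimal illustration, not the authors' implementation: the label list `ACTIONS`, the function name `shift_workspace`, the normalization of the direction vector, and the step size are all assumptions made here.

```python
import numpy as np

# Hypothetical label set; the real HRI30 classes are not listed in this text.
ACTIONS = ["idle", "start walk", "handover"]

def shift_workspace(x_min, x_max, confidences, direction, step):
    """Shift the box [x_min, x_max] along `direction` when "start walk" is top-ranked."""
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    top = ACTIONS[int(np.argmax(confidences))]
    if top != "start walk":
        return x_min, x_max          # no adaptation for other actions
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)        # assumption: direction is used as a unit vector
    return x_min + step * d, x_max + step * d

# "start walk" wins, so the window moves 5 cm along +x.
lo, hi = shift_workspace([0, 0, 0], [1, 1, 1], [0.1, 0.8, 0.1], [2, 0, 0], 0.05)

# "idle" wins, so the bounds are returned unchanged.
lo_idle, hi_idle = shift_workspace([0, 0, 0], [1, 1, 1], [0.7, 0.2, 0.1], [2, 0, 0], 0.05)
```

Both bounds move by the same offset, so the window keeps its size and only its location follows the human.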
5. End-to-End Algorithmic Workflow
AHQP's real-time control stack, as implemented on the MOCA platform (Franka Panda with mobile base), follows this loop at approximately 1 kHz:
- Acquire camera streams (RGB and RGB-D).
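- Classify object surfaces via ResNet+SVM to determine the binary surface label $s$.
- Extract hand positions $x_h$ using OpenPose keypoints.
- Recognize human actions using SlowOnly@ResNet50, yielding the class confidence vector $c$.
- Decode collaborative state and direction; update shared workspace constraints and EE goals accordingly.
- Assemble the Level 1 CLIK cost (matrices $A_1$, $b_1$) with joint and EE limits.
- Define the Level 2 slack cost ($\|w\|^2$) and the workspace constraints (lower bound $x_{\min}$, upper bound $x_{\max}$).
- Add the Level 3 ergonomics cost (matrices $A_e$, $b_e$).
- Solve the lexicographic QP for the optimal joint velocities $\dot{q}^*$ and desired EE velocities $\dot{x}_d^*$.
- Issue $\dot{q}^*$ to robot control (impedance law) and update the EE goal $x_d$ with $\dot{x}_d^*$.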
This pipeline enables fluid, closed-loop adaptation to human intent and environment while prioritizing both task performance and user comfort (Tassi et al., 2022).
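The ordering of the loop can be sketched in a few lines of Python. Everything here is a stand-in: perception outputs arrive as a precomputed dictionary, `toy_solver` replaces the lexicographic QP with a simple proportional law under an identity Jacobian, and the names `control_step`, `toy_solver`, `percepts`, and all gains are hypothetical.

```python
import numpy as np

def control_step(x_d_prev, x_min, x_max, percepts, solve_qp, dt, step=0.05):
    """One loop iteration: constraint update -> solve -> integrate the EE goal."""
    # Update the shared workspace from the decoded action.
    if percepts["top_action"] == "start walk":
        d = percepts["direction"]
        x_min, x_max = x_min + step * d, x_max + step * d
    # Solve for joint velocities and desired EE velocities.
    q_dot, x_dot_d = solve_qp(x_d_prev, x_min, x_max, percepts["hand_pos"])
    # Integrate the optimized EE velocity to obtain the new EE goal.
    x_d = x_d_prev + x_dot_d * dt
    return q_dot, x_d, x_min, x_max

def toy_solver(x_d_prev, x_min, x_max, hand_pos, gain=5.0):
    """Stand-in for the lexicographic QP (identity Jacobian, no hierarchy)."""
    target = np.clip(hand_pos, x_min, x_max)   # hard box instead of a soft one
    x_dot_d = gain * (target - x_d_prev)
    return x_dot_d.copy(), x_dot_d

x_d = np.zeros(2)
x_min, x_max = np.zeros(2), np.ones(2)
percepts = {"top_action": "idle",
            "direction": np.array([1.0, 0.0]),
            "hand_pos": np.array([1.5, 0.5])}

# Two simulated seconds at a 1 kHz loop rate.
for _ in range(2000):
    q_dot, x_d, x_min, x_max = control_step(x_d, x_min, x_max, percepts,
                                            toy_solver, dt=0.001)
```

Passing the solver in as a callable keeps the loop structure independent of how the three-level QP is actually implemented.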
6. Empirical Validation and Performance
The AHQP framework demonstrates the following quantitative results on the MOCA platform:
- Object-surface classification: 100% accuracy with ResNet50+SVM over a 2,000-image benchmark.
- Action recognition: 86.55% accuracy on the HRI30 dataset with SlowOnly@ResNet50 pretrained on Kinetics-400.
- Ergonomics in tool-handover: AHQP maintains a lower human ergonomics score than the 4.2 obtained without ergonomics optimization.
- Iterative workspace adaptation: During “follow-the-human” trials, cumulative REBA-based scores remain below 1.0 with AHQP and rise above 3.5 without ergonomics.
- Operator usability: NASA-TLX surveys reveal reduced mental and physical load, and improved perceived performance, for users assisted by AHQP's ergonomics optimization.
Collectively, these results highlight AHQP's effectiveness in fusing action-recognition signals, adaptive workspace soft constraints, and ergonomics into a unified real-time controller, yielding fluid and sociable human-robot interaction while promoting user comfort and trust in automation (Tassi et al., 2022).