Uncertainty-Aware Shared Autonomy System
- Uncertainty-aware shared autonomy systems are Bayesian frameworks that combine human input with autonomous assistance while explicitly modeling uncertainty over latent user goals, enabling robust collaboration.
- They leverage probabilistic models such as POMDPs and maximum entropy inverse optimal control (MaxEnt IOC), using recursive belief updates and hindsight optimization to infer intent and optimize action selection.
- Empirical studies reveal reduced task completion times and lower user input effort, though they also highlight a trade-off between operational efficiency and perceived human control.
An uncertainty-aware shared autonomy system combines user input and autonomous behavior in a manner that explicitly models, estimates, and leverages uncertainty—often over user goals, environment state, operator intent, or downstream task outcome—to produce more robust, adaptive, and efficient human-robot collaboration. These systems typically formalize assistance as a Partially Observable Markov Decision Process (POMDP) or a related probabilistic framework, apply principled methods for intent inference and belief updating, and select robot actions that optimize expected outcomes under uncertainty. Modern implementations further address the trade-off between efficiency and operator satisfaction, dynamically arbitrate control authority, and incorporate user feedback regarding perceived control, task performance, and trust.
1. Formal Models of Uncertainty in Shared Autonomy
The typical foundation for uncertainty-aware shared autonomy is a POMDP whose state is augmented to include a latent user goal $g$ alongside the physical robot state $x$, so $s = (x, g)$. The goal is not directly observed; instead, the system maintains a belief $b(g)$ that is recursively updated based on a stochastic observation model derived from user actions:

$$b_t(g) \;=\; p(g \mid \xi_{0:t}) \;\propto\; p(\xi_{0:t} \mid g)\, p(g),$$

where $\xi_{0:t}$ is the history of user inputs and $p(\xi_{0:t} \mid g)$ is the probability that these observed actions are generated by a user targeting goal $g$. This belief state drives downstream action selection.
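A minimal sketch of this recursive Bayesian update over a discrete goal set, assuming a generic user-input likelihood $p(u \mid x, g)$ is supplied by the caller (the function name and array interface are illustrative, not from the source):

```python
import numpy as np

def update_goal_belief(belief, likelihoods):
    """One step of the recursive Bayesian update b'(g) ∝ p(u | x, g) * b(g).

    belief      : (num_goals,) array, current belief b(g), sums to 1.
    likelihoods : (num_goals,) array, p(u | x, g) for the observed user input.
    """
    posterior = belief * likelihoods          # unnormalized b'(g)
    total = posterior.sum()
    if total <= 0.0:                          # degenerate observation: keep prior
        return belief
    return posterior / total                  # renormalize so b'(g) sums to 1

# Example: three candidate goals, user input most consistent with goal 1.
b = np.array([1/3, 1/3, 1/3])
b = update_goal_belief(b, likelihoods=np.array([0.1, 0.7, 0.2]))
print(b)  # belief mass shifts toward goal 1
```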
The system's cost function typically penalizes deviation from user input, time-to-goal, or violations of explicit safety constraints. Because the exact cost-to-go cannot be computed—the space of possible future goal assignments grows combinatorially—the robot policy is instead computed in expectation over the goal belief, with further approximations to ensure tractability.
2. Intent Prediction and Goal Inference
To predict the user’s intent, maximum entropy inverse optimal control (MaxEnt IOC) is leveraged: the probability of observing a given user trajectory $\xi$ under goal $g$ is

$$p(\xi \mid g) \;\propto\; \exp\!\left(-C_g(\xi)\right),$$

where $C_g(\xi)$ is the cumulative cost of $\xi$ evaluated with respect to goal $g$. The system recursively updates the goal belief using the likelihoods of the user’s recent actions, and exploits soft-min operators to efficiently approximate these likelihoods over a continuous control space:

$$\pi(u \mid x, g) \;=\; \exp\!\left(V_g(x) - Q_g(x, u)\right), \qquad V_g(x) \;=\; \operatorname{softmin}_{u}\, Q_g(x, u).$$
The belief is used not just for goal selection but as direct input to planning—allowing the system to hedge across multiple plausible user goals during assistance.
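As a concrete illustration, the sketch below converts goal-conditioned trajectory costs $C_g(\xi)$ into a goal posterior via the MaxEnt likelihood. It omits the per-goal partition function, which a full implementation would approximate with the soft-min value functions above; the names and cost values are illustrative assumptions:

```python
import numpy as np

def maxent_goal_posterior(traj_costs, prior):
    """Posterior over goals from MaxEnt IOC: p(g | xi) ∝ exp(-C_g(xi)) * p(g).

    traj_costs : (num_goals,) array, cumulative cost C_g(xi) of the observed
                 user trajectory xi under each candidate goal g.
    prior      : (num_goals,) array, current belief over goals.

    Note: the per-goal normalizer of p(xi | g) is ignored here for simplicity.
    """
    # Subtract the minimum cost before exponentiating for numerical
    # stability; the shift cancels in the normalization.
    log_lik = -(traj_costs - traj_costs.min())
    posterior = np.exp(log_lik) * prior
    return posterior / posterior.sum()

# Example: the observed partial trajectory is cheapest to explain under goal 0.
costs = np.array([2.0, 5.0, 4.0])            # C_g(xi) for three goals
print(maxent_goal_posterior(costs, np.ones(3) / 3))
```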
3. Hindsight Optimization for Belief-Space Planning
Solving the full POMDP is intractable for real-time applications. The system instead uses “QMDP,” or hindsight optimization, which assumes goal uncertainty will be resolved after the next action. The robot’s action-value function is then

$$Q(b, x, a) \;=\; \sum_{g} b(g)\, Q_g(x, a),$$

where $Q_g(x, a)$ is the cost-to-go under the (known) goal $g$. The system therefore selects robot actions that minimize the expected cost-to-go across the entire distribution over goals, not just the single most likely one. In variants, if the robot “takes over”, planning assumes no further human control, i.e., $u \equiv 0$.
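A minimal sketch of this QMDP action selection, assuming goal-conditioned cost-to-go estimates $Q_g(x, a)$ are already available (e.g., from per-goal planners); the interface is an illustrative assumption:

```python
import numpy as np

def qmdp_action(belief, q_values):
    """Select the action minimizing expected cost-to-go under the goal belief.

    belief   : (num_goals,) array, b(g).
    q_values : (num_goals, num_actions) array, Q_g(x, a) for each known goal.

    Implements Q(b, x, a) = sum_g b(g) * Q_g(x, a), then argmin over a.
    """
    expected_q = belief @ q_values            # (num_actions,) expected cost-to-go
    return int(np.argmin(expected_q))

# Example: two goals, three candidate actions.
b = np.array([0.6, 0.4])
Q = np.array([[1.0, 3.0, 2.0],               # Q_g0(x, a) for actions a0..a2
              [4.0, 1.5, 2.0]])              # Q_g1(x, a)
print(qmdp_action(b, Q))                      # -> 2
```

Note that in the example the selected action is best under neither goal individually; minimizing expected cost-to-go naturally hedges across the belief rather than committing to the most likely goal.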
This contrasts with the common “predict-then-blend” approach, which commits to a single likely goal and blends its next action with the user’s input based on a confidence score. Hindsight optimization enables continuous, responsive assistance from the outset of the task.
4. User Study Findings: Performance and Control Trade-offs
Experimental evaluation on shared manipulation tasks (e.g., object grasping with a robot arm) found that the uncertainty-aware method using hindsight optimization led to:
- Reduced task completion times: users finished tasks faster when the system assisted toward all likely goals simultaneously.
- Lower cumulative user input: less physical effort was required from the user.
- Mixed subjective impressions: despite improved objective metrics, some users felt the hindsight-optimization policy “took over” too much, reporting a decreased sense of control relative to the predict-then-blend approach, even when the robot never deviated from their inferred goal.
The table below highlights this tradeoff:
| Method | Completion Time | User Input Effort | User Perceived Control |
|---|---|---|---|
| Hindsight Optimization | Lower | Lower | Lower |
| Predict-then-Blend | Higher | Higher | Higher |
While hindsight optimization improves external efficiency measures, the perceived authority of the human is sometimes compromised—a crucial consideration in system usability and acceptance.
5. Trade-offs, Limitations, and Design Implications
The adoption of belief-distribution-based assistance mechanisms introduces an inherent trade-off between task efficiency and user satisfaction with respect to control authority. Key observations are:
- Some users prefer to maintain higher personal control, even at the cost of less efficient task execution. For these users, a system that responds to the full goal belief can appear overly assertive.
- Others are willing to adapt to a system that maximizes efficiency, learning to communicate their intent with minimal intervention.
- System designers may need to personalize autonomy blending strategies, either by explicitly allowing users to tune their preferred autonomy level or by adaptively learning blending weights from user feedback (a minimal blending sketch follows this list).
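One possible personalization mechanism, sketched below under the assumption of a simple linear arbitration rule; the weighting scheme, names, and parameters are illustrative, not prescribed by the source:

```python
import numpy as np

def blend_actions(user_cmd, robot_cmd, confidence, user_autonomy_pref):
    """Linearly arbitrate between user and robot commands.

    user_cmd, robot_cmd : (dof,) arrays of velocity commands.
    confidence          : float in [0, 1], e.g., max goal-belief probability.
    user_autonomy_pref  : float in [0, 1], user-tuned cap on robot authority
                          (0 = pure teleoperation, 1 = full assistance).
    """
    alpha = min(confidence, user_autonomy_pref)   # effective robot authority
    return alpha * robot_cmd + (1.0 - alpha) * user_cmd

# Example: a user who caps assistance at 0.5 keeps majority control even
# when the system is highly confident about the inferred goal.
u = np.array([0.0, 1.0])
r = np.array([1.0, 0.0])
print(blend_actions(u, r, confidence=0.9, user_autonomy_pref=0.5))
```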
Future directions include embedding user preference models within the cost function, or enabling dynamic arbitration between autonomy and teleoperation depending on real-time measures of user confidence, task progress, or saliency of perceived control.
6. Broader Impact and Current Research Directions
The uncertainty-aware shared autonomy framework, by explicitly modeling the belief over user goals and using this distribution to optimize assistance, has been foundational in subsequent research for both manipulation and navigation. Featured variants include:
- Generalization to human-robot teaming (Javdani et al., 2017)
- Robustness to model misspecification and intent misspecification (Zurek et al., 2021)
- Quantitative system-level robustness evaluation (Deglurkar et al., 2024)
- Extension to risk-averse planning in dynamic, multi-agent, and safety-critical domains (Naghshvar et al., 2018; Zhang et al., 2024; Yu et al., 2025)
The QMDP-based approach is now a reference design choice for shared autonomy applications in the presence of intent ambiguity, especially when user oversight and rapid adaptation to user intent are required alongside robust, real-time robot behavior.
7. Summary
Uncertainty-aware shared autonomy systems fuse user input and robot planning in a Bayesian framework, modeling user intent as latent, maintaining an updated belief over possible goals, and employing approximation (QMDP/hindsight optimization) for belief-space planning to select robot actions that minimize expected cost. Experiments show improvements in efficiency and user effort, while exposing crucial design trade-offs regarding perceived control. Subsequent research continues to refine arbitration strategies, intent inference, and robustness, often with user satisfaction as an explicit objective alongside task performance.