Open-Universe Assistance Games
- OU-AGs are a framework where an AI assistant infers, tracks, and acts on dynamically specified human goals communicated in natural language.
- They integrate open-universe modeling, POMDPs, and cooperative game theory with language-based probabilistic inference to manage deep goal uncertainty.
- Empirical evaluations in domains like robotics, grocery shopping, and Minecraft show improved collaboration metrics and reduced human effort.
Open-Universe Assistance Games (OU-AGs) are a class of multi-agent sequential decision problems in which an AI assistant must infer, track, and act on human goals that originate from an unbounded, possibly evolving space. Unlike settings in which all possible goals are predefined, OU-AGs formalize collaboration in domains where the agent's prior knowledge of user intent is incomplete, goals are often communicated in natural language, and specification must occur dynamically during interaction. This framework integrates and advances concepts from open-universe modeling, POMDPs, cooperative game theory, and human-in-the-loop AI to rigorously address decision-making under deep goal uncertainty.
1. Formal Framework of Open-Universe Assistance Games
OU-AGs generalize traditional assistance games by explicitly allowing the state space of possible goals to be open, unbounded, and defined dynamically through human–AI interaction (Ma et al., 20 Aug 2025). In contrast to frameworks such as Cooperative Inverse Reinforcement Learning (CIRL), which operate over a fixed set of reward parameters, OU-AGs represent the world as containing an evolving set of human preference types, typically communicated via natural language. The state in an OU-AG therefore comprises both the environment configuration and a possibly unbounded latent goal set:
$$s = (s_{\text{env}}, \mathcal{G})$$
where $s_{\text{env}}$ is the environment state and $\mathcal{G} = \{g_1, g_2, \dots\}$ is the current set of candidate human goal types.
The formal model can be seen as an Open-Universe POMDP (OU-POMDP), whose tuple is:
$$\langle \mathcal{S}, \mathcal{A}, \Omega, T, O, R \rangle$$
where $\mathcal{S}$ is the world-and-goal state space, $\mathcal{A} = \mathcal{A}^H \times \mathcal{A}^R$ is the joint action space, $\Omega$ the observations, $T$ the transition dynamics, $O$ the observation model, and $R$ the reward function:
$$R(s, a^H, a^R; \theta)$$
where $\theta$ parametrizes the latent human goal.
This approach allows for goals that are not known a priori and can be specified, revised, or refined through dialogue during the episode.
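To make the state factorization concrete, the following minimal Python sketch models an OU-AG state with an open, mutable goal set. The names (`Goal`, `OUState`) are illustrative assumptions, not identifiers from the cited papers:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: `Goal` and `OUState` are hypothetical names,
# not identifiers from (Ma et al., 20 Aug 2025).

@dataclass(frozen=True)
class Goal:
    """A candidate human goal, kept in explicit natural-language form."""
    description: str  # e.g. "restock the weekly staples"

@dataclass
class OUState:
    """Joint state s = (s_env, G): environment plus an open goal set."""
    env: dict                                        # environment state s_env
    goals: list[Goal] = field(default_factory=list)  # open candidate set G

    def add_goal(self, description: str) -> None:
        """New goals may be introduced mid-episode as dialogue reveals intent."""
        self.goals.append(Goal(description))
```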
2. Algorithms and Probabilistic Goal Inference
A central challenge in OU-AGs is maintaining and updating a belief over an unbounded goal space. The GOOD (GOals from Open-ended Dialogue) method addresses this by leveraging an LLM to continually extract and maintain candidate goals from the interaction transcript (Ma et al., 20 Aug 2025). GOOD consists of three modules (a minimal code sketch follows the list):
- Goal Proposal Module: Prompts an LLM to propose and refine candidate goal sets in response to the latest dialogue, dynamically tracking user intent.
- Removal/Pruning Module: Prunes goals that become unlikely or satisfied, using ranking thresholds to limit the hypothesis set.
- Ranking/Inference Module: Assigns probabilistic scores to candidate goals based on pairwise LLM-based plausibility comparisons.
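The hedged Python sketch below runs one GOOD-style propose/rank/prune pass. The `llm` callable, the prompt wording, and the 0.5 cutoff are assumptions made for illustration; the paper's actual prompts and thresholds are not reproduced here:

```python
from typing import Callable

def good_step(transcript: str,
              candidates: dict[str, tuple[int, int]],  # goal -> (wins, losses)
              llm: Callable[[str], str],
              keep_threshold: float = 0.5) -> dict[str, tuple[int, int]]:
    """One propose/rank/prune pass; prompts and threshold are assumptions."""
    # 1. Goal proposal: ask the LLM for new or refined goals given the dialogue.
    proposed = llm(f"List plausible user goals, one per line:\n{transcript}")
    for line in proposed.splitlines():
        if line.strip():
            candidates.setdefault(line.strip(), (0, 0))

    # 2. Ranking: pairwise plausibility comparisons accumulate wins and losses.
    goals = list(candidates)
    for i, a in enumerate(goals):
        for b in goals[i + 1:]:
            verdict = llm(f"Given the dialogue:\n{transcript}\n"
                          f"Which goal is more plausible?\nA: {a}\nB: {b}\n"
                          f"Answer A or B.")
            (wa, la), (wb, lb) = candidates[a], candidates[b]
            if verdict.strip().upper().startswith("A"):
                candidates[a], candidates[b] = (wa + 1, la), (wb, lb + 1)
            else:
                candidates[a], candidates[b] = (wa, la + 1), (wb + 1, lb)

    # 3. Pruning: drop goals whose posterior-mean plausibility falls below
    #    the threshold (smoothed Beta mean; see the next paragraph).
    def post_mean(w: int, l: int) -> float:
        return (w + 1) / (w + l + 2)
    return {g: wl for g, wl in candidates.items()
            if post_mean(*wl) >= keep_threshold}
```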
Probabilistic inference uses a Beta distribution to represent confidence in each goal:
$$p_i \sim \mathrm{Beta}(\alpha_i, \beta_i)$$
where $\alpha_i$ and $\beta_i$ are the numbers of LLM "wins" and "losses" for candidate $g_i$ in pairwise comparisons. Candidates are retained if their estimated probability exceeds a threshold and the posterior variance is low.
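A worked example of the Beta confidence score follows. The +1 Laplace smoothing is an assumption made to keep the distribution well-defined at zero counts, and the cutoff values are illustrative only:

```python
# Beta-based goal confidence: mean and variance from win/loss counts.
# The +1 smoothing and the retention cutoffs are assumptions.

def beta_stats(wins: int, losses: int) -> tuple[float, float]:
    a, b = wins + 1, losses + 1
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, var

def retain(wins: int, losses: int,
           min_mean: float = 0.6, max_var: float = 0.02) -> bool:
    mean, var = beta_stats(wins, losses)
    return mean >= min_mean and var <= max_var

print(beta_stats(8, 2))  # (0.75, ~0.014): confident, retained
print(retain(1, 1))      # False: mean 0.5, variance 0.05 -- too uncertain
```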
3. Diegetic Feedback and Internalized Counterfactual Analysis
Diegetic feedback mechanisms extend the formalism by embedding counterfactual reasoning and equilibrium computation within the game’s own dynamics (Capucci, 2022). This approach eliminates the need for extrinsic equilibrium predicates, instead using selection functions and functorial lens constructions to handle payoff propagation and counterfactual deviation analysis.
- The payoff function is "lifted" via a functor to a lens that allows backward propagation (coplay mechanism) of global payoff information.
- Selection lenses incorporate each player’s preferences directly into the feedback flow, enabling players to evaluate unilateral deviations as intrinsic narrative processes.
A fixpoint behavior in the overall parametric lens system corresponds exactly to a Nash equilibrium:
$$\sigma^* \in \mathrm{BR}(\sigma^*)$$
where $\mathrm{BR}$ maps strategy profiles to their best-response sets via diegetically propagated payoff information.
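The cited construction is categorical; purely as a plain-Python illustration of the fixpoint characterization, the brute-force check below verifies $\sigma \in \mathrm{BR}(\sigma)$ on a small finite bimatrix game (the payoffs are the standard Prisoner's Dilemma, chosen only for illustration):

```python
# A profile is a Nash equilibrium iff it is a fixpoint of the best-response map.
import itertools

def best_responses(payoffs, profile, player, n_actions):
    """Actions maximizing `player`'s payoff, holding the other player fixed."""
    def util(a):
        p = list(profile); p[player] = a
        return payoffs[player][tuple(p)]
    best = max(util(a) for a in range(n_actions))
    return {a for a in range(n_actions) if util(a) == best}

def is_nash(payoffs, profile, n_actions):
    """profile in BR(profile) for every player <=> Nash equilibrium."""
    return all(profile[i] in best_responses(payoffs, profile, i, n_actions)
               for i in range(2))

# Prisoner's Dilemma: action 0 = cooperate, 1 = defect.
P0 = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}
P1 = {(0, 0): 3, (0, 1): 5, (1, 0): 0, (1, 1): 1}
nash = [p for p in itertools.product(range(2), repeat=2)
        if is_nash((P0, P1), p, 2)]
print(nash)  # [(1, 1)] -- mutual defection is the unique fixpoint
```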
4. Scalable Solution Methods for High-Dimensional OU-AGs
The AssistanceZero algorithm demonstrates that OU-AGs are tractable even in environments with combinatorially vast goal spaces (Laidlaw et al., 9 Apr 2025). AssistanceZero extends AlphaZero by introducing neural network "heads" for:
- Reward Parameter Prediction: Forms beliefs over latent goals from historical interaction.
- Human Action Prediction: Forecasts the user’s likely next actions given state and dialogue history.
The loss function for AssistanceZero is a weighted sum:
$$\begin{align*} L(\phi) = \frac{1}{n} \sum_t \Big[ & \lambda_{\text{policy}}\, D_{\text{KL}}\big(\pi_t^{\text{MCTS}} \,\big\|\, \pi_\phi(\cdot \mid h_t)\big) \\ + & \lambda_{\text{value}} \Big(V_\phi(h_t) - \sum_{t'} \gamma^{t'-t} R\big(s_{t'}, a^H_{t'}, a^R_{t'}; \theta\big)\Big)^2 \\ - & \lambda_{\text{reward}} \log p_\phi(\theta \mid h_t) \\ - & \lambda_{\text{prev-rew}}\, D_{\text{KL}}\big(p_\phi(\theta \mid h_t) \,\big\|\, p_t(\theta)\big) \\ - & \lambda_{\text{action}} \log p_\phi(a^H_t \mid h_t) \Big] \end{align*}$$
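A hedged PyTorch-style sketch of computing this weighted sum is shown below, assuming precomputed tensors for the MCTS policy targets, discounted-return value targets, and the reward/action head outputs. The weight values and tensor shapes are illustrative, and the term signs follow the displayed loss:

```python
import torch
import torch.nn.functional as F

def assistance_loss(mcts_pi,         # [T, A] MCTS visit distributions (targets)
                    policy_logits,   # [T, A] pi_phi(. | h_t)
                    value_pred,      # [T]    V_phi(h_t)
                    value_target,    # [T]    discounted return under theta
                    reward_logp,     # [T]    log p_phi(theta_true | h_t)
                    reward_log_post, # [T, K] current log-posterior over theta
                    prev_log_post,   # [T, K] previous step's log-posterior
                    action_logp,     # [T]    log p_phi(a^H_t | h_t)
                    w=(1.0, 0.5, 1.0, 0.1, 1.0)):  # illustrative weights
    log_pi = F.log_softmax(policy_logits, dim=-1)
    # D_KL(pi_MCTS || pi_phi), computed term by term
    policy_kl = (mcts_pi * (torch.log(mcts_pi.clamp_min(1e-8)) - log_pi)).sum(-1)
    value_mse = (value_pred - value_target) ** 2
    # D_KL(p_phi(theta|h_t) || p_t(theta)) between successive posteriors
    prev_kl = (reward_log_post.exp()
               * (reward_log_post - prev_log_post)).sum(-1)
    per_step = (w[0] * policy_kl + w[1] * value_mse
                - w[2] * reward_logp - w[3] * prev_kl - w[4] * action_logp)
    return per_step.mean()
```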
Monte Carlo tree search is run over full interaction histories (not just environmental states), making planning robust under deep uncertainty. Empirical results in complex environments (e.g., Minecraft-based assistance games with combinatorially vast goal spaces) show significant improvements in collaboration metrics such as fraction of completed goals and reduction in required human effort.
5. Evaluation Domains and Empirical Results
OU-AGs have been instantiated and evaluated in domains encompassing both language-centric and embodiment-centric settings (Ma et al., 20 Aug 2025, Laidlaw et al., 9 Apr 2025):
| Domain | Task Type | Agent Output | Main Metric |
|---|---|---|---|
| Grocery Shopping | Text-based, profile-driven | Complete cart list | Cart Score |
| Home Robotics | Physical/simulated actions | Action sequence | Action Score |
| Minecraft Build | 3D/embodied collaboration | Structure construction | % Structure, User Action Reduction |
Agents utilizing explicit goal inference (GOOD or AssistanceZero) consistently outperform baselines that do not track goal hypotheses, leading to higher Cart and Action Scores and improved subjective ratings of helpfulness. For example, in Minecraft, AssistanceZero-trained assistants reduced human partner actions by over 40% compared to model-free RL baselines, and in household/robotics settings, GOOD increased task completion alignment as measured by both human and LLM-based judges.
6. Interpretability and Human-AI Interaction
A salient property of the OU-AG formalism is interpretability: agents are able to represent and expose their internal hypotheses about human intent at any stage, as the goal set is maintained in explicit language form (Ma et al., 20 Aug 2025). This capability supports:
- Corrigible and safe interactions, as agents can "explain" their goals to human users and be audited for alignment and ambiguity.
- Reduced specification effort for designers; open-ended language goals remove the necessity for exhaustively pre-specified reward sets.
The use of probabilistic inference and Beta distributions for goal ranking further enables transparent communication of uncertainty.
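As a small illustration of this transparency, an assistant could surface its ranked language-form hypotheses together with calibrated uncertainty. The goal strings and counts below are made up, and the normal approximation to the Beta posterior is a simplifying assumption:

```python
# Report each goal hypothesis with its posterior mean and an approximate
# 90% interval (normal approximation to the smoothed Beta posterior).
from statistics import NormalDist

def report(goals: dict[str, tuple[int, int]]) -> None:
    z = NormalDist().inv_cdf(0.95)
    for g, (w, l) in sorted(goals.items(), key=lambda kv: -kv[1][0]):
        a, b = w + 1, l + 1
        mean = a / (a + b)
        sd = ((a * b) / ((a + b) ** 2 * (a + b + 1))) ** 0.5
        lo, hi = max(0.0, mean - z * sd), min(1.0, mean + z * sd)
        print(f"{g}: p ~= {mean:.2f} (approx. 90% interval {lo:.2f}-{hi:.2f})")

report({"restock weekly staples": (8, 2), "plan a party menu": (3, 6)})
```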
7. Connections to Broader Theories and Extensions
OU-AGs synthesize advances from formal lens-based game theory, diegetic feedback, scalable model-based planning, and natural language processing. The functorial construction developed in (Capucci, 2022) provides a compositional backbone that unites agency, learning, and counterfactual dynamics. AssistanceZero's architecture demonstrates practical tractability at scale (Laidlaw et al., 9 Apr 2025), while GOOD formalizes efficient and interpretable goal management (Ma et al., 20 Aug 2025).
A plausible implication is that OU-AGs can serve as an abstraction not only for embodied collaboration but also for general LLM assistant training, where goal statements are open, evolving, and communicable. They offer a rigorous, unified foundation for the design, training, and evaluation of next-generation human-centered AI systems.