Goal Recognition: Methods and Applications
- Goal recognition is the computational task of inferring an agent’s objectives from limited, noisy observations using formal planning and probabilistic models.
- It leverages techniques such as cost-based planning, landmark heuristics, Bayesian inference, and deep learning to handle uncertainty and dynamic contexts.
- Applications span human-robot interaction, autonomous driving, and surveillance, where explainability enhances user trust and practical deployment.
Goal Recognition (GR) is the computational problem of inferring an agent’s objectives from partial, noisy, or ambiguous observations of its behavior within a specific environment or domain. This task is central to the construction of intelligent and interactive systems—ranging from surveillance, human-robot interaction, and autonomous driving to gaming and adaptive user interfaces—where understanding or anticipating intentions is critical for effective response or intervention. Goal recognition has evolved from early cost-based planning approaches to contemporary techniques integrating learning, probabilistic inference, dynamic adaptation, and explainability.
1. Formal Definitions, Problem Scope, and Key Models
In its canonical form, a goal recognition problem is defined by the tuple ⟨D, G, O⟩, where D is the domain theory (typically an MDP or planning model), G is a set of candidate goals, and O is a (possibly partial, noisy, or temporally extended) sequence of observations. The output is an inferred goal g* ∈ G that best explains O. Recent work formalizes online and dynamic GR as sequences of problems ⟨D, G_t, O_t⟩, where G_t and O_t may evolve over time, enabling GR in changing contexts (Matan et al., 27 Sep 2025).
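The canonical tuple can be sketched as a small data structure. This is a minimal illustration, not any library's API: the `GRProblem` class, the `cost` oracle, and the toy goal names are all invented here, with the domain theory reduced to a single cost function supplied by an external planner.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class GRProblem:
    """Sketch of the canonical GR tuple <D, G, O>: the domain theory D is
    abstracted into a cost oracle, G is the candidate goal set, and O is
    the observation sequence."""
    cost: Callable[[str, Sequence[str]], float]  # cost(goal, observations) -> plan cost
    goals: list[str]                             # candidate goal set G
    observations: list[str]                      # observation sequence O

    def best_goal(self) -> str:
        """Return the goal whose observation-constrained plan is cheapest."""
        return min(self.goals, key=lambda g: self.cost(g, self.observations))
```

An online/dynamic variant would re-instantiate `goals` and extend `observations` at each time step rather than fixing them up front.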
The landscape of goal recognition encompasses several modeling paradigms:
- Plan Recognition as Planning (PRP): Hypotheses about goals are evaluated by comparing costs of optimal plans versus plans constrained to explain observations. The difference, or "cost difference," serves as a compatibility measure (Pereira, 2020).
- Probabilistic Plan Recognition: Bayesian frameworks compute posterior probabilities over goals, using plan costs to define likelihoods (e.g., P(O | g) ∝ exp(−β·Δ(g, O)), where Δ(g, O) is the incremental cost to explain O given g) (Pereira, 2020).
- Operator-Counting and LP-based Approaches: Heuristic estimation via integer or linear programming, with operator-counting and observation-counting constraints, directly encodes the plan structure and observability/noise (Santos et al., 2019, Meneguzzi et al., 11 Apr 2024).
- Landmark Methods: Utilize unavoidable "milestones" in the plan space—propositions or actions that must be satisfied en route to a goal—to filter or score candidate goals efficiently (Pereira et al., 2019).
- Learning-Based Methods: Rely on deep learning, reinforcement learning, or metric learning to process observation traces and perform GR without full domain models (Amado et al., 2022, Chiari et al., 2022, Shamir et al., 6 May 2025).
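The PRP cost-difference measure from the first bullet can be made concrete in a few lines. This is a minimal sketch assuming the two plan costs per goal (optimal, and optimal-while-explaining-the-observations) have already been computed by an external planner; the function names are illustrative.

```python
def cost_difference(opt_cost: float, constrained_cost: float) -> float:
    """PRP compatibility score: the extra cost a goal hypothesis must pay
    to also explain the observations. 0 means the observations lie on an
    optimal plan for that goal; larger values mean a worse fit."""
    return constrained_cost - opt_cost

def rank_goals(costs: dict[str, tuple[float, float]]) -> list[str]:
    """Rank candidate goals by ascending cost difference.
    costs maps goal -> (optimal cost, observation-constrained cost)."""
    return sorted(costs, key=lambda g: cost_difference(*costs[g]))
```

In practice the two costs per goal dominate runtime, which is why the heuristic and learning-based paradigms below try to avoid repeated planner calls.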
2. Algorithmic Methodologies and Representations
Planning-Based and Heuristic Approaches
In classical settings, GR is framed as searching for (near-)optimal plans that align with observed behaviors. Operator-counting frameworks formulate IPs/LPs that minimize total plan cost over per-operator count variables, with extensions to observation-counting constraints, sensor unreliability, and landmark constraints to handle partial and noisy data (Santos et al., 2019, Meneguzzi et al., 11 Apr 2024).
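The flavor of these constraints can be shown without a full LP solver. The sketch below is a hand-rolled stand-in, not the published formulation: it starts from observation-counting lower bounds (each observed action must be counted at least as often as it was seen) and greedily covers uncovered disjunctive landmarks with their cheapest action, whereas a real implementation solves all constraints jointly as an IP/LP. All action names are invented.

```python
def oc_heuristic(costs: dict[str, float],
                 obs_counts: dict[str, int],
                 landmarks: list[set[str]]) -> float:
    """Toy operator-counting lower bound. Counts start at the
    observation-counting bounds y_a >= n_a; each disjunctive landmark not
    yet covered contributes one count of its cheapest member action."""
    counts = dict(obs_counts)
    for lm in landmarks:
        if not any(counts.get(a, 0) > 0 for a in lm):
            a = min(lm, key=lambda act: costs[act])
            counts[a] = counts.get(a, 0) + 1
    return sum(costs[a] * n for a, n in counts.items())
```

Sensor unreliability is handled in the real LP by relaxing the observation-counting bounds so that a bounded fraction of observations may be left unexplained.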
Landmark-based heuristics add structural pruning: if a landmark required by a goal is not observed, or is contradicted by the sequence, that goal is downweighted or eliminated (Pereira et al., 2019). New LP constraints further bind operator counts to observed landmarks, producing heuristics that tightly lower bound valid plan costs while robustly handling sensor noise (Meneguzzi et al., 11 Apr 2024).
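A common landmark-based scoring rule is the completion ratio: the fraction of a goal's landmarks already achieved in the observed trace. The sketch below assumes landmarks have been extracted offline per goal and observations reduced to a set of achieved facts; the function name is illustrative.

```python
def landmark_scores(goal_landmarks: dict[str, set[str]],
                    observed_facts: set[str]) -> dict[str, float]:
    """Score each candidate goal by the fraction of its landmarks already
    achieved in the observations. Goals whose required milestones are
    absent from the trace score low and can be pruned early."""
    return {g: (len(lms & observed_facts) / len(lms) if lms else 0.0)
            for g, lms in goal_landmarks.items()}
```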
Probabilistic and Bayesian Inference
Probabilistic plan recognition uses cost-based likelihoods and Bayesian updating: P(g | O) ∝ P(O | g)·P(g), where P(O | g) is estimated via plan cost mismatch and the prior P(g) can be uniform or, more realistically, weighted by solvability/easiness (Pereira, 2020, Zhang et al., 16 Feb 2024). Recent work shows that action information dominates human inferences, but timing ("thinking time") and goal solvability also influence P(g | O), motivating "Easiness Priors" and likelihood decompositions of the form P(O | g) = P_act(O | g)·P_time(O | g), where P_act scores action compatibility and P_time incorporates timing features (Zhang et al., 16 Feb 2024).
Deep/Reinforcement Learning and Metric Learning
Model-free RL approaches learn goal-parameterized policies or Q-functions offline. At inference, the observation trace is evaluated against each candidate goal's learned policy using metrics such as the summed utility (MaxUtil), KL-divergence from a softmax policy, or divergence points (Amado et al., 2022). Transfer learning and policy aggregation enable scalability to dynamic or previously unseen goal sets (Shamir et al., 23 Jul 2024, Shamir et al., 6 May 2025). Metric learning frameworks (e.g., GRAML) employ Siamese networks to embed observation traces into a goal-discriminative latent space, supporting rapid one-shot adaptation to new goals (Shamir et al., 6 May 2025).
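One such trace-versus-policy measure can be sketched directly: score a trace by its cross-entropy under the softmax policy derived from each goal's Q-values. This is an illustrative stand-in for the divergence measures cited above, with a toy tabular Q-function and invented names.

```python
import math

def softmax(qvals: list[float], temp: float = 1.0) -> list[float]:
    """Boltzmann policy over Q-values (max-shifted for stability)."""
    m = max(qvals)
    exps = [math.exp((q - m) / temp) for q in qvals]
    z = sum(exps)
    return [e / z for e in exps]

def trace_score(observed_actions: list[int],
                q_table: dict[int, list[float]],
                states: list[int]) -> float:
    """Sum of -log pi_g(a_t | s_t) over the trace: the cross-entropy of the
    observed actions under goal g's softmax policy. Lower means the trace
    looks more like behavior directed at that goal."""
    total = 0.0
    for s, a in zip(states, observed_actions):
        pi = softmax(q_table[s])
        total += -math.log(pi[a] + 1e-12)
    return total
```

Recognition then picks the goal whose Q-table yields the lowest score; MaxUtil would instead sum Q-values along the trace.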
Fact Probability Vector (FPV) approaches map both observed facts and expected fact probabilities (conditioned on candidate goals) into a real vector space. By computing the distance between the observed state, initial state, and these goal-specific vectors, GR can be performed efficiently even in low-observability or large-scale settings (Wilken et al., 26 Aug 2024).
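The FPV idea reduces to nearest-neighbor matching in vector space. A minimal sketch, assuming the per-goal expected fact-probability vectors have been precomputed; names are illustrative.

```python
def fpv_distance(observed: list[float], expected: list[float]) -> float:
    """Euclidean distance between the fact-probability vector induced by
    the observations and the expected vector for a candidate goal."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected)) ** 0.5

def fpv_recognize(observed: list[float],
                  goal_fpvs: dict[str, list[float]]) -> str:
    """Pick the goal whose expected fact-probability vector is closest to
    what was actually observed."""
    return min(goal_fpvs, key=lambda g: fpv_distance(observed, goal_fpvs[g]))
```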
3. Observability, Noise, and Real-Time Adaptation
Addressing partial and noisy observations is central. LP/IP-based approaches calibrate for observation sparsity and sensor unreliability, relaxing constraints to ignore a fraction of observations that may be erroneous (Santos et al., 2019, Meneguzzi et al., 11 Apr 2024). Uncertainty factors (e.g., scaling costs with an uncertainty ratio) allow selection of robust candidate sets under incomplete data.
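The robust-candidate-set idea can be illustrated as follows: rather than committing to a single argmin goal under sparse data, keep every goal whose score is within a multiplicative factor of the best. This sketch and its threshold parameter are illustrative, not the exact published formulation.

```python
def robust_candidates(cost_diffs: dict[str, float],
                      theta: float = 1.5) -> set[str]:
    """Keep every goal whose cost difference is within a multiplicative
    uncertainty ratio theta of the best one. theta = 1 recovers the strict
    argmin; larger values trade precision for robustness to noise."""
    best = min(cost_diffs.values())
    return {g for g, d in cost_diffs.items() if d <= theta * best + 1e-12}
```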
New paradigms center on online, dynamic, and adaptive GR, where the goal set changes and observations arrive incrementally. Transfer learning for Q-functions or metric embeddings supports rapid adaptation (1.5 seconds per new goal versus hundreds of seconds for re-training), enabling practical deployment in settings such as navigation, human-robot teaming, and surveillance (Shamir et al., 23 Jul 2024, Elhadad et al., 14 May 2025).
Standardized frameworks, including open-source libraries (gr-libs, gr-envs), now provide unified benchmarks, Gym-compatible environments, and diagnostics for systematic evaluation of dynamic and online GR algorithms (Matan et al., 27 Sep 2025).
4. Explainable and Human-Centered Goal Recognition
Recent efforts emphasize explainability via human-centered models. The eXplainable Goal Recognition (XGR) model leverages the Weight of Evidence (WoE) framework, computing woe(g : g′ | o) = log(P(o | g) / P(o | g′)) to identify "observational markers"—the actions or features most diagnostic for preferring one goal over alternatives. The framework generates both "why" and "why not" explanations, highlighting actions supporting the recognized goal and counterfactuals that would support alternatives (Alshehri et al., 2023, Alshehri et al., 18 Sep 2024). Empirical results from human studies indicate substantial gains in user understanding and trust when XGR explanations accompany GR outputs. Theoretical grounding in cognitive science supports the model's alignment with human explanatory practices.
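The WoE computation itself is a log-likelihood ratio per observation. A minimal sketch, assuming per-observation likelihoods under the recognized goal and one alternative are available; the helper names are invented.

```python
import math

def weight_of_evidence(p_obs_given_g: float, p_obs_given_alt: float) -> float:
    """woe(g : g' | o) = log(P(o | g) / P(o | g')): how strongly a single
    observation o favors goal g over alternative g'. Positive values
    support g; the largest ones are the 'observational markers' surfaced
    in 'why' explanations, while strongly negative ones feed 'why not'
    explanations for the alternative."""
    return math.log(p_obs_given_g / p_obs_given_alt)

def top_markers(obs_likelihoods: dict[str, tuple[float, float]],
                k: int = 3) -> list[str]:
    """Rank observations by WoE and return the k most diagnostic ones.
    obs_likelihoods maps observation -> (P(o|g), P(o|g'))."""
    return sorted(obs_likelihoods,
                  key=lambda o: -weight_of_evidence(*obs_likelihoods[o]))[:k]
```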
In the context of embodied or safety-critical domains (e.g., autonomous driving), interpretable and verifiable models (e.g., decision trees with indicator variables in OGRIT) permit formal verification and transparent reasoning, further promoting user acceptance (Brewitt et al., 2022).
5. Domains of Application and Evaluation
Goal recognition has been applied and evaluated across a wide spectrum:
- Classical Planning Domains: Blocks World, Depots, Driverlog, IPC-Grid, Sokoban, Logistics, Zenotravel—offering varied complexity, observability, and partial/noisy observation challenges (Pereira et al., 2019, Santos et al., 2019, Meneguzzi et al., 11 Apr 2024, Wilken et al., 26 Aug 2024).
- Autonomous Driving: OGRIT demonstrates accurate, interpretable GR under occlusion using shallow decision trees, supported by new occlusion-annotated datasets (inDO, rounDO, OpenDDO) (Brewitt et al., 2022).
- Human-Agent and Assistive Robotics: Incorporation of natural language feedback (as in the D4GR framework) improves robustness to sub-optimal actions and sensor noise (Idrees et al., 2023). Real robot deployment further validates practical viability.
- Partially Observable and Continuous Domains: Hierarchical reinforcement learning with relational graphs (GRG) supports generalization to new goals/environments in object search and navigation (Ye et al., 2021). Model-free methods extend to continuous control domains (PointMaze, Parking, Panda-Gym) (Shamir et al., 6 May 2025, Matan et al., 27 Sep 2025).
- Goal Recognition Design: Data-driven environment design (using worst-case distinctiveness, wcd, as the measure of observational distinctiveness) enables optimization for easier recognition even under general, suboptimal, or learned behavior models, validated by both simulation and human experiments (Kasumba et al., 3 Apr 2024).
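For optimal agents, wcd has a simple reading: the longest action prefix shared by optimal plans to two different goals, i.e. the most steps an observer might wait before the goal is revealed. The sketch below computes it for precomputed plans; the toy plans and names are invented, and GRD systems would then modify the environment to drive this number down.

```python
from itertools import combinations

def shared_prefix_len(p1: list[str], p2: list[str]) -> int:
    """Length of the common action prefix of two plans."""
    n = 0
    for a, b in zip(p1, p2):
        if a != b:
            break
        n += 1
    return n

def wcd(optimal_plans: dict[str, list[str]]) -> int:
    """Worst-case distinctiveness over a goal -> optimal-plan mapping:
    the longest prefix shared by plans to two *different* goals."""
    return max((shared_prefix_len(p, q)
                for (_, p), (_, q) in combinations(optimal_plans.items(), 2)),
               default=0)
```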
6. Impact, Benchmarks, and Future Directions
State-of-the-art GR approaches have demonstrated substantial improvements:
- LP-based and fact-probability vector methods achieve higher precision and robustness under low observability and sensor noise while reducing computational overhead (Santos et al., 2019, Meneguzzi et al., 11 Apr 2024, Wilken et al., 26 Aug 2024).
- Learning-based approaches dispense with hand-crafted models, accelerating inference and bootstrapping to new goals with few or even single-shot examples (Amado et al., 2022, Shamir et al., 6 May 2025).
- Explainable models (XGR) close the human-agent interpretability gap, improving satisfaction, task prediction, and trust (Alshehri et al., 2023, Alshehri et al., 18 Sep 2024).
- The emergence of reproducibility-focused frameworks (gr-libs, gr-envs) and standardized evaluation protocols (Matan et al., 27 Sep 2025) addresses prior fragmentation in methodology, expediting comparative studies and accelerating community progress.
Open research frontiers include further integration with meta- and transfer learning to handle multi-domain, non-stationary, and evolving GR tasks (Elhadad et al., 14 May 2025), systematic treatment of temporally extended and abstract goal structures (e.g., with LTL/PLTL semantics (Pereira et al., 2023)), refinement of probabilistic and cognitive priors for more human-like recognition (Zhang et al., 16 Feb 2024), and extending gradient-based environment design frameworks for practical deployment (Kasumba et al., 3 Apr 2024).
The trajectory of GR research increasingly emphasizes scalability, adaptability, transparency, and alignment with end-user needs, supporting the deployment of intelligent systems capable of inferring goals efficiently, accurately, and intelligibly across a range of complex and dynamic real-world environments.