
Value-Guided Construal Models

Updated 25 June 2026
  • Value-Guided Construal (VGC) models are frameworks that optimize internal representations by balancing expected utility against representational costs.
  • They are applied across domains like LLM decoding, human planning, moral reasoning, and goal-conditioned world modeling using task-specific value functions.
  • Algorithmic implementations such as MAVIS, IVO, and JEPA demonstrate efficiency gains and improved decision-making under resource constraints.

A Value-Guided Construal (VGC) model is a theoretical and algorithmic framework for adaptive representation, decision, or generation, in which the construction or selection of internal representations, policies, or outputs is explicitly optimized under task-specific value functions and resource constraints. Originating from resource-rational and bounded-rationality perspectives, VGC formalizes how agents, biological or artificial, simplify complex environments, balance utility against representational cost, or dynamically steer large generative models. It does so by learning value functions or explicit construal policies that determine which information enters planning, inference, or output construction. A literature spanning computational cognitive science, model-based control, and LLM alignment has instantiated the VGC approach in domains ranging from human-like mental simulation to multi-objective LLM decoding and goal-conditioned world-model planning.

1. Formal Foundations of Value-Guided Construal

VGC models share a canonical structure: they define a trade-off between the utility of a representation, policy, or output (usually quantified by a value or reward function) and its cost, complexity, or resource consumption. The general objective is

$$r^* = \arg\max_{r \in \mathcal{R}} \left[U(r) - \beta C(r)\right]$$

where $r$ is a representation (or construal) selected from the space $\mathcal{R}$, $U(r)$ is the expected task utility under $r$, $C(r)$ is a cost or complexity measure (e.g., $|r|$, KL divergence, coding length), and $\beta$ is a trade-off parameter. This abstraction includes:

Notably, VGC is not tied to any particular format of value function or resource cost, allowing both soft-inclusion (e.g., smoothed by attentional kernels (Castanheira et al., 11 Jun 2025)) and discrete selection.
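The general objective can be illustrated with a minimal sketch that exhaustively searches subsets of a small feature set. The feature names, the additive utility, and the choice of $C(r) = |r|$ are assumptions made for illustration only; they are not taken from any of the cited models, and exhaustive search is practical only for very small construal spaces.

```python
from itertools import combinations

def best_construal(features, utility, cost, beta):
    """Return the subset r maximizing U(r) - beta * C(r), by brute force."""
    best = frozenset()
    best_score = utility(best) - beta * cost(best)
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            r = frozenset(subset)
            score = utility(r) - beta * cost(r)
            if score > best_score:
                best, best_score = r, score
    return best, best_score

# Hypothetical toy task: each obstacle adds a fixed utility when represented.
GAIN = {"wall_a": 3.0, "wall_b": 0.5, "wall_c": 1.5}

def utility(r):
    return sum(GAIN[f] for f in r)

def cost(r):
    return len(r)  # C(r) = |r|, the number of represented features

construal, score = best_construal(list(GAIN), utility, cost, beta=1.0)
sparse, sparse_score = best_construal(list(GAIN), utility, cost, beta=5.0)
```

With `beta=1.0` only features whose utility exceeds their unit cost are kept; raising `beta` to 5.0 makes every feature too expensive, so the empty construal wins.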

2. Value-Guided Decoding and Inference in LLMs

Recent applications of VGC in LLMs optimize over output sequences using learned value functions to steer generation towards user-specified objectives without full retraining.

Multi-Objective Alignment (MAVIS)

MAVIS (Carleton et al., 19 Aug 2025) trains a set of per-objective value models, each a lightweight LM with a regression head, to estimate KL-regularized expected returns. At inference, user-specified weights over the objectives combine these value estimates into a tilting function, and token-level policies are adjusted by exponentially tilting the base model's next-token distribution with it.

Each value model is trained by KL-regularized policy iteration, in which empirical returns penalized by log-probability ratios are regressed onto the value heads. This enables post hoc adjustment of the trade-off among multiple goals, strict monotonic policy improvement, and expansion of the achievable Pareto frontier relative to baseline mixtures.
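A minimal sketch of value-weighted exponential tilting at a single decoding step follows. It uses the standard closed form for a KL-regularized objective, in which the adjusted distribution is proportional to the reference probability times the exponential of the weighted value divided by a temperature. The function name, the dictionary-based interface, and the temperature `beta` are assumptions for illustration; this is not presented as the exact MAVIS update.

```python
import math

def tilt_policy(ref_probs, values, weights, beta=1.0):
    """Tilt a next-token distribution by a weighted sum of per-objective values.

    ref_probs: token -> probability under the base model (all > 0)
    values:    objective -> (token -> estimated value)
    weights:   objective -> user-specified weight
    """
    logits = {}
    for tok, p in ref_probs.items():
        combined = sum(w * values[obj][tok] for obj, w in weights.items())
        logits[tok] = math.log(p) + combined / beta
    # Normalize with the max subtracted for numerical stability.
    m = max(logits.values())
    z = sum(math.exp(l - m) for l in logits.values())
    return {tok: math.exp(l - m) / z for tok, l in logits.items()}

# Hypothetical two-token vocabulary and two objectives.
ref = {"a": 0.5, "b": 0.5}
vals = {"helpful": {"a": 1.0, "b": 0.0}, "harmless": {"a": 0.0, "b": 1.0}}

favor_helpful = tilt_policy(ref, vals, {"helpful": 1.0, "harmless": 0.0})
unchanged = tilt_policy(ref, vals, {"helpful": 0.0, "harmless": 0.0})
```

Changing the weights at inference time shifts probability mass between tokens without touching the base model, which is the sense in which the trade-off is adjustable post hoc.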

Iterative Value Function Optimization (IVO)

IVO (Liu et al., 4 Mar 2025) introduces a critic (a learned value function) trained by regression on Monte Carlo rollouts, and iteratively improves the policy through value-guided updates to the decoding distribution.

This approach steers decoding toward higher reward without updating the LLM backbone weights, substantially reducing computational cost relative to RLHF. IVO achieves significant empirical gains on summarization, dialog, and instruction tasks, outperforms prior value-guided sampling methods (FUDGE, ARGS, VAS), and yields favorable GPT-4 win-rates.
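The Monte Carlo step that produces regression targets for a critic can be sketched as follows. The sampler, the reward function, and the rollout count are hypothetical stand-ins for an LLM and a reward model; the sketch only shows how averaging rewards over sampled continuations yields a value estimate for a prefix.

```python
import random

def mc_value(prefix, sample_continuation, reward, n_rollouts=32, seed=0):
    """Estimate the value of `prefix` as the mean reward over sampled rollouts."""
    rng = random.Random(seed)
    returns = []
    for _ in range(n_rollouts):
        completion = sample_continuation(prefix, rng)
        returns.append(reward(prefix + completion))
    return sum(returns) / len(returns)

# Hypothetical stand-ins for the base model and the reward model.
def toy_sampler(prefix, rng):
    return [rng.choice(["good", "bad"]) for _ in range(3)]

def toy_reward(tokens):
    return tokens.count("good") / len(tokens)

# Regression targets: one (prefix, estimated value) pair per prefix.
prefixes = [["good"], ["bad"]]
targets = {tuple(p): mc_value(p, toy_sampler, toy_reward) for p in prefixes}
```

A critic would then be fit to these pairs by regression; in an iterative scheme the rollouts are re-collected under the improved policy and the critic is refit.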

3. Value-Guided Construal in Human Planning and Mental Simulation

VGC has been applied to models of human planning, exemplifying the resource-rational principle that agents filter and encode only task-relevant features.

Just-in-Time (JIT) World Modeling

JIT planning (Chen et al., 20 Jan 2026) implements VGC not via explicit search over representations, but through an interleaved simulate–lookahead–encode process. The agent maintains a working memory (the construal) containing only a small subset of all possible objects or obstacles.

  • Simulation steps trigger a lookahead that identifies unencoded but soon-to-be-relevant objects.
  • Objects flagged are dynamically encoded; unused items decay probabilistically according to power-law forgetting.
  • The process supports efficient prediction and planning and correlates highly with human behavioral probes, while reducing the average number of objects represented and matching or exceeding classical VGC models across task variants.

Efficiency is obtained by estimating need probabilities for each object via Monte Carlo over sampled trajectories, updating construals "just in time" as demanded by the evolving simulation state.
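The need-probability estimate and the encode/forget update can be sketched as below. The threshold, the decay exponent, the retention rule `(1 + age) ** (-decay)`, and all function names are assumptions chosen for illustration; the source specifies only Monte Carlo estimation over sampled trajectories and power-law forgetting.

```python
import random

def need_probabilities(sample_trajectory, objects, n_samples=200, seed=0):
    """Monte Carlo estimate: fraction of sampled rollouts that touch each object."""
    rng = random.Random(seed)
    counts = {o: 0 for o in objects}
    for _ in range(n_samples):
        for o in sample_trajectory(rng):  # set of objects the rollout interacts with
            if o in counts:
                counts[o] += 1
    return {o: c / n_samples for o, c in counts.items()}

def update_construal(construal, need, age, threshold=0.3, decay=0.5, seed=0):
    """Encode objects whose need is high; retain others with power-law probability."""
    rng = random.Random(seed)
    kept = {o for o, p in need.items() if p >= threshold}
    for o in construal - kept:
        retention = (1 + age.get(o, 0)) ** (-decay)  # assumed power-law form
        if rng.random() < retention:
            kept.add(o)
    return kept

# Hypothetical simulator in which every rollout hits obstacle "a" and never "b".
need = need_probabilities(lambda rng: {"a"}, ["a", "b"])
encoded = update_construal(set(), need, age={})
fresh = update_construal({"b"}, need, age={"b": 0})
stale = update_construal({"b"}, need, age={"b": 10**12})
```

An object that was just used (age 0) is retained with probability 1 under this rule, while a very old unused object is almost surely dropped.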

Attentional and Perceptual Modulation

Extensions incorporating visuospatial attention ("spotlight-VGC") (Castanheira et al., 11 Jun 2025) introduce soft gating over which features enter the task representation, parameterized by spatial kernels or lateralization and tuned via a participant-specific attention radius. The agent's attention function determines which environmental features are included in the simplified model through smoothed inclusion probabilities, accounting for human-like crowding and lateralization effects in virtual maze navigation.
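Soft gating by a spatial kernel can be sketched with a Gaussian falloff around the attended location. The Gaussian form, the coordinate convention, and the function name are assumptions for illustration; the source states only that inclusion is smoothed by spatial kernels with a participant-specific radius.

```python
import math

def inclusion_probability(feature_xy, focus_xy, radius):
    """Probability that a feature enters the construal, decaying with distance."""
    dx = feature_xy[0] - focus_xy[0]
    dy = feature_xy[1] - focus_xy[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * radius * radius))

# Hypothetical maze features at increasing distance from the attended cell.
focus = (0.0, 0.0)
at_focus = inclusion_probability((0.0, 0.0), focus, radius=2.0)
near = inclusion_probability((1.0, 0.0), focus, radius=2.0)
far = inclusion_probability((5.0, 0.0), focus, radius=2.0)
far_wide = inclusion_probability((5.0, 0.0), focus, radius=4.0)
```

A larger radius widens the spotlight, so distant features are more likely to be represented.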

4. VGC in Moral Reasoning with LLMs

VGC is also employed for moral and value-sensitive LLMs (Chakraborty et al., 17 Jun 2025). Here, the construal is instantiated as a combination of structured prompts reflecting value systems and ethical theories, eliciting chain-of-thought-style justifications and decisions.

  • A taxonomy of prompts combines psychological value frameworks (e.g., Schwartz, Moral Foundations) and explicit ethical theories (e.g., Care Ethics) to scaffold model reasoning.
  • A distillation pipeline transfers competence from large teacher models, minimizing a hybrid loss over token-by-token imitation and semantic consistency, yielding scalable, interpretable, and value-grounded reasoning in small models.
  • Structured prompting and distillation yield consistent improvements in moral decision accuracy and justification coherence over label-only baselines.
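One way to combine token-by-token imitation with semantic consistency is a weighted sum of a token-level negative log-likelihood and an embedding distance. The mixing weight `alpha`, the use of cosine distance, and the function names are assumptions for illustration only; the source does not specify the form of the hybrid loss.

```python
import math

def token_nll(student_probs, teacher_tokens):
    """Mean negative log-likelihood the student assigns to the teacher's tokens."""
    total = sum(math.log(p[t]) for p, t in zip(student_probs, teacher_tokens))
    return -total / len(teacher_tokens)

def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def hybrid_loss(student_probs, teacher_tokens, student_emb, teacher_emb, alpha=0.5):
    """Assumed hybrid objective: alpha * imitation + (1 - alpha) * semantic term."""
    imitation = token_nll(student_probs, teacher_tokens)
    semantic = cosine_distance(student_emb, teacher_emb)
    return alpha * imitation + (1.0 - alpha) * semantic

# Hypothetical two-token justification and 2-d sentence embeddings.
teacher = ["care", "first"]
perfect = hybrid_loss([{"care": 1.0}, {"first": 1.0}], teacher, [1.0, 0.0], [1.0, 0.0])
worse = hybrid_loss([{"care": 0.5}, {"first": 0.5}], teacher, [0.0, 1.0], [1.0, 0.0])
```

The loss is zero when the student reproduces the teacher's tokens with certainty and the embeddings coincide, and it grows as either term degrades.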

5. VGC in Goal-Conditioned World Models and Control

In model-based control, VGC formalizes how value structure shapes representation and action planning.

JEPA World Models

Destrade et al. (28 Dec 2025) introduce VGC within a Joint-Embedding Predictive Architecture (JEPA) for goal-reaching tasks.

  • The value function, defined as the negative cost-to-go for reaching a goal, is approximated by the negative distance between state and goal embeddings, where the distance is Euclidean or quasi-metric.
  • Training alternates between, or jointly optimizes, a JEPA prediction loss and an Implicit Q-Learning (IQL) value loss, with expectile regression shaping the embedding geometry for effective planning.
  • Model Predictive Path Integral control (MPPI) uses these distances at test time for high-accuracy action planning, outperforming contrastive and standard regression approaches, but displays limitations in long-range calibration and stochastic settings.
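The two ingredients named above, a value read off as negative embedding distance and an expectile regression loss, can be sketched as follows. The Euclidean distance, the expectile level `tau=0.7`, and the function names are assumptions for illustration; the loss follows the usual asymmetric squared form used in expectile regression.

```python
import math

def value(z_state, z_goal):
    """Value as negative Euclidean distance between state and goal embeddings."""
    return -math.dist(z_state, z_goal)

def expectile_loss(residuals, tau=0.7):
    """Mean of |tau - 1[u < 0]| * u**2 over residuals u = target - prediction."""
    weighted = [(tau if u >= 0 else 1.0 - tau) * u * u for u in residuals]
    return sum(weighted) / len(weighted)

# Hypothetical 2-d embeddings and residuals.
v = value((0.0, 0.0), (3.0, 4.0))
under = expectile_loss([1.0], tau=0.7)   # prediction below target
over = expectile_loss([-1.0], tau=0.7)   # prediction above target
symmetric = expectile_loss([1.0, -1.0], tau=0.5)
```

With `tau` above 0.5, underestimates are penalized more than overestimates, which pushes the learned value toward an upper expectile of the targets; at `tau=0.5` the loss reduces to half the mean squared error.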

6. Algorithmic Summaries and Theoretical Guarantees

Across VGC instantiations, several recurrent themes and guarantees emerge:

| Domain | Value Function | Inference/Planning Mechanism | Theoretical Guarantee |
|---|---|---|---|
| LLM Decoding | KL-regularized, per-objective | Exponential tilting (MAVIS), IVO top-k/beam | Monotonic improvement, Pareto optimality (Carleton et al., 19 Aug 2025, Liu et al., 4 Mar 2025) |
| Human Planning | Utility vs. complexity | Greedy/spotlight representational search, JIT | Efficiency, tight human fits (Chen et al., 20 Jan 2026, Castanheira et al., 11 Jun 2025) |
| World Modeling | Negative embedding distance | MPPI w/ JEPA, value-influenced control | Improved planning accuracy (Destrade et al., 28 Dec 2025) |
| Moral Reasoning | Prompt-structured value tradeoff | Structured prompting, distillation | Accuracy, coherence improvements (Chakraborty et al., 17 Jun 2025) |

KL-regularized policy iteration in LLM contexts enjoys strict monotonic policy improvement, provable convergence to optimal token-level policies under certain bandit settings, and empirical Pareto-front dominance in multi-objective evaluation (Carleton et al., 19 Aug 2025). In perceptual VGC, resource-bounded optimization yields fit measures (e.g., human–model correlation, RMSE, log-likelihood) that closely match human data (Chen et al., 20 Jan 2026, Castanheira et al., 11 Jun 2025).

7. Limitations, Efficiency Gains, and Open Challenges

VGC models achieve substantial empirical and computational efficiency gains over traditional RLHF or brute-force search:

  • MAVIS and IVO require only small value heads and limited rollout sampling, yielding speedups on the order of 100× for LLM alignment (Carleton et al., 19 Aug 2025, Liu et al., 4 Mar 2025).
  • JIT and attentional VGC in perceptual domains encode significantly fewer features for similar predictive power, trading occasional planning suboptimality for memory savings (Chen et al., 20 Jan 2026).
  • In JEPA world models, value-guided construals improve planning accuracy but struggle with rare state–goal pairs and with calibration far from the goal; improvements require either hierarchical latents or more strategically curated datasets (Destrade et al., 28 Dec 2025).

Broader open issues include:

  • Extending VGC to domains with high non-stationarity or combinatorial construal spaces.
  • Scalability of representational search in environments with ambiguous or weakly-structured value signals.
  • Formal generalization bounds under resource constraints and finite-sample regimes.

