GateFrame: Normative Policy Gating
- GateFrame is a normative framework that formalizes policy gating via entropy-regularized free energy minimization, integrating decision theory, neuroscience, and machine learning.
- It derives a closed-form softmax rule for the mixing weights over a library of primitive policies; entropy regularization makes the underlying program strongly convex with a unique optimum.
- GateFrame underpins the GateMod suite by linking mathematical free-energy decomposition with algorithmic (GateFlow) and biologically-plausible (GateNet) implementations for adaptive control.
GateFrame is a normative framework for policy gating based on minimizing free energy, providing a unifying principle to describe and analyze the selection and composition of policies in decision-making, neuroscience, and machine learning. Central to the GateMod suite, GateFrame formalizes gating as the entropy-regularized minimization of Kullback–Leibler divergence between induced and desired agent-environment dynamics, parameterized by mixing weights over a library of primitive policies. Its convex structure, closed-form softmax gating rule, and principled decomposition extend across diverse domains, from cognitive models to engineering control (Rossi et al., 4 Dec 2025).
1. Mathematical Definition and Optimization Problem
GateFrame targets the policy mixture
$$\pi_w(a \mid s) \;=\; \sum_{k=1}^{K} w_k \, \pi_k(a \mid s),$$
where each $\pi_k$ is a primitive policy from a finite library $\{\pi_1, \dots, \pi_K\}$, and $w = (w_1, \dots, w_K)$ are nonnegative mixing weights constrained to the simplex $\Delta_K = \{\, w : w_k \ge 0,\ \textstyle\sum_k w_k = 1 \,\}$. The optimization seeks $w^{\star}$ minimizing the entropy-regularized KL divergence
$$F(w) \;=\; D_{\mathrm{KL}}\!\left(p_w \,\middle\|\, p^{\star}\right) \;-\; \tau\, H(w),$$
where $p_w$ is the resulting distribution over agent–environment trajectories under the mixture, $p^{\star}$ is a generative model encoding desired dynamics or task constraints, $H(w) = -\sum_k w_k \log w_k$ is the Shannon entropy, and $\tau > 0$ the temperature. Because the mapping $w \mapsto D_{\mathrm{KL}}(p_w \,\|\, p^{\star})$ is convex and $H$ is strictly concave, GateFrame is a strongly convex program with a guaranteed unique solution.
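As a minimal numerical sketch (assuming discrete outcome distributions; the arrays and function name below are illustrative, not from the paper), the objective can be evaluated directly:

```python
import numpy as np

def free_energy(w, primitives, p_star, tau):
    """F(w) = KL(p_w || p*) - tau * H(w) for discrete outcome distributions.

    primitives: (K, N) array whose row k is the distribution induced by
    primitive k; w: weights on the K-simplex; p_star: desired distribution.
    """
    p_w = primitives.T @ w                     # mixture-induced distribution
    kl = np.sum(p_w * np.log(p_w / p_star))    # KL(p_w || p*)
    H = -np.sum(w * np.log(w))                 # Shannon entropy of the weights
    return kl - tau * H
```

Strict convexity can be checked numerically: the value at the midpoint of two weight vectors lies strictly below the average of their values.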
2. Free-Energy Decomposition and Objective Structure
A standard KL decomposition of GateFrame's objective, given an exponentially tilted $p^{\star}$ with respect to a cost $C$ (i.e., $p^{\star}(x) \propto q(x)\, e^{-C(x)}$ for a reference prior $q$), yields
$$D_{\mathrm{KL}}\!\left(p_w \,\middle\|\, p^{\star}\right) \;=\; \mathbb{E}_{p_w}[C] \;+\; D_{\mathrm{KL}}\!\left(p_w \,\middle\|\, q\right) \;+\; \log Z,$$
where $\log Z$ is constant with respect to $w$. Thus, the optimization reduces to minimizing
$$\tilde{F}(w) \;=\; \mathbb{E}_{p_w}[C] \;+\; D_{\mathrm{KL}}\!\left(p_w \,\middle\|\, q\right) \;-\; \tau\, H(w).$$
GateFrame thus realizes an entropy-regularized free energy minimization that simultaneously penalizes expected cost and divergence from a generative prior, connecting frameworks such as active inference, maximum-entropy RL, and KL-control under a common principle.
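The decomposition can be verified numerically in a small discrete example (all quantities below are hypothetical):

```python
import numpy as np

# Tilted target p* ∝ q · exp(-C) for a reference prior q and cost C.
q = np.array([0.5, 0.3, 0.2])
C = np.array([1.0, 0.2, 2.0])
unnorm = q * np.exp(-C)
Z = unnorm.sum()
p_star = unnorm / Z

p_w = np.array([0.4, 0.4, 0.2])                  # some induced distribution

kl_target = np.sum(p_w * np.log(p_w / p_star))   # KL(p_w || p*)
# Expected cost + KL to the prior + log-partition constant:
decomposed = p_w @ C + np.sum(p_w * np.log(p_w / q)) + np.log(Z)
# The two quantities agree: KL(p_w || p*) = E[C] + KL(p_w || q) + log Z.
```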
3. Closed-Form Softmax Gating Solution
GateFrame admits a closed-form solution for the optimal gating, derived via Lagrangian methods on the simplex. Setting the stationarity conditions for $w \in \Delta_K$ and solving leads to
$$w_k^{\star} \;=\; \frac{\exp(-g_k/\tau)}{\sum_{j=1}^{K} \exp(-g_j/\tau)}$$
for each primitive $\pi_k$. The logit scores $g_k$ encode, for each primitive, the marginal gain in free energy from adjusting its weight, rendering the solution interpretable as softmax policy arbitration. As $\tau \to 0$, the solution reduces to hard argmax selection; as $\tau \to \infty$, the weights approach uniform mixing.
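A numerically stable sketch of the gating rule (the function name is illustrative):

```python
import numpy as np

def gate(g, tau):
    """Softmax gating: w_k proportional to exp(-g_k / tau)."""
    z = -np.asarray(g, dtype=float) / tau
    z -= z.max()                  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()
```

At small $\tau$ the weight concentrates on the primitive with the lowest logit score; at large $\tau$ the weights flatten toward uniform mixing.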
4. Influence of Task Structure on Gating
The form of $g_k$ embeds all task-specific structure, determined by:
- the generative model $p^{\star}$ (equivalently, the prior $q$),
- the cost function $C$ or any learned dynamics biases,
- the environment transition kernel $P(s' \mid s, a)$.
Practically, the gradient involves computing expected log-likelihoods and costs under each primitive:
$$g_k \;=\; \mathbb{E}_{p_k}\!\left[\, C(x) \;+\; \log p_w(x) \;-\; \log q(x) \,\right],$$
up to an additive constant absorbed by the softmax normalization, where $p_k$ denotes the distribution induced by primitive $\pi_k$ alone. These context-dependent terms ensure that gating weights adapt online to both environmental conditions and demands of the given task, producing flexible, interpretable selection of policies according to moment-to-moment utility and generative fit.
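In the discrete case the logit scores can be computed exactly; the sketch below (illustrative names, assumed discrete setup) also admits a finite-difference check against the reduced objective:

```python
import numpy as np

def logits(w, primitives, C, q):
    """Logit scores g_k = E_{p_k}[C + log p_w - log q] (discrete case).

    The additive constant arising from differentiating the KL term is
    dropped, since the softmax normalization absorbs it.
    """
    p_w = primitives.T @ w
    integrand = C + np.log(p_w) - np.log(q)   # per-outcome score
    return primitives @ integrand             # expectation under each p_k
```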
5. Normative Properties and Theoretical Guarantees
GateFrame exhibits several notable normative features:
- Strong convexity and uniqueness: The entropy term ensures a unique optimum for $w$ on $\Delta_K$.
- Principled optimality: Solutions minimize a well-motivated free-energy functional unifying multiple frameworks in decision theory.
- Interpretability: Gating logits measure each primitive’s mismatch to the task-driven generative model or expected costs.
- Continuity: Varying $\tau$ interpolates between hard selection and equivocal softmax arbitration.
- Framework generality: Any policy set $\{\pi_k\}$, cost structure, or environment model can be accommodated, making GateFrame broadly applicable across neuroscience, cognition, and engineered controllers.
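The continuity property can be illustrated by sweeping the temperature for fixed logit scores: the entropy of the gating weights grows from near zero (hard selection) toward $\log K$ (uniform mixing). The scores below are illustrative:

```python
import numpy as np

g = np.array([1.0, 0.2, 2.0])          # fixed logit scores (hypothetical)
entropies = []
for tau in (0.01, 0.1, 1.0, 10.0):
    z = np.exp(-(g - g.min()) / tau)   # stabilized softmax numerator
    w = z / z.sum()
    entropies.append(-np.sum(w * np.log(w + 1e-300)))
# entropies rises monotonically from ~0 toward log(3) as tau grows.
```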
6. Connections to GateFlow and Neural Realization via GateNet
GateFrame provides the normative foundation for two subsequent realizations in GateMod:
- GateFlow is a continuous-time proximal-gradient ODE whose unique, globally exponentially stable equilibrium is the GateFrame solution $w^{\star}$. Its vector field keeps trajectories within $\Delta_K$ and ensures strict, monotonic decrease of the cost functional at an exponential rate.
- GateNet implements GateFlow as a biologically plausible recurrent circuit. The network comprises two modules: a fast stage computing the logit scores $g_k$ using local, contextual (Sigma-Pi) computations with log/linear activations, and a slow stage performing softmax normalization via exponentiation and normalization. All neurons obey nonnegativity constraints (interpretable as firing rates) and exchange only local information.
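The paper's exact vector field is not reproduced here, so the following is a hedged sketch: a multiplicative (mirror-descent) discretization of a simplex-preserving gradient flow that shares GateFlow's stated properties — iterates remain on $\Delta_K$ and converge to the GateFrame softmax solution. Function names, step size, and iteration count are assumptions:

```python
import numpy as np

def gateflow_step(w, g, tau, eta=0.1):
    """One discretized step: multiplicative update on F(w) = g·w - tau*H(w).

    The exponential-family form keeps every iterate strictly positive,
    and the renormalization keeps it on the simplex.
    """
    v = w * np.exp(-eta * (g + tau * np.log(w)))
    return v / v.sum()

def gateflow(g, tau, eta=0.1, steps=500):
    g = np.asarray(g, dtype=float)
    w = np.full(g.shape, 1.0 / g.size)   # start from uniform mixing
    for _ in range(steps):
        w = gateflow_step(w, g, tau, eta)
    return w
```

In log-space this update is a linear contraction with factor $(1 - \eta\tau)$, so iterates converge exponentially to $w_k^{\star} \propto \exp(-g_k/\tau)$, mirroring the globally exponentially stable equilibrium described above.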
This succession—GateFrame (normative), GateFlow (algorithmic), and GateNet (mechanistic)—establishes a rigorous pipeline from free-energy-based gating objectives down to dynamical and neural implementations, supporting both interpretability and cross-domain applicability (Rossi et al., 4 Dec 2025).