Active Inference: A method for Phenotyping Agency in AI systems?

Published 25 Apr 2026 in cs.AI | (2604.23278v1)

Abstract: The proliferation of agentic artificial intelligence has outpaced the conceptual tools needed to characterize agency in computational systems. Prevailing definitions mainly rely on autonomy and goal-directedness. Here, we argue for a minimal notion open to principled inspection given three criteria: intentionality as action grounded in beliefs and desires, rationality as normatively coherent action entailed by a world model, and explainability as action causally traceable to internal states; we subsequently instantiate these as a partially observable Markov decision process under a variational framework wherein posterior beliefs, prior preferences, and the minimization of expected free energy jointly constitute an agentic action chain. Using a canonical T-maze paradigm, we evidence how empowerment, formulated as the channel capacity between actions and anticipated observations, serves as an operational metric that distinguishes zero-, intermediate-, and high-agency phenotypes through structural manipulations of the generative model. We conclude by arguing that as agents engage in epistemic foraging to resolve ambiguity, the governance controls that remain effective must shift systematically from external constraints to the internal modulation of prior preferences, offering a principled, variational bridge from computational phenotyping to AI governance strategy.


Authors (7)

Summary

The paper introduces a framework that redefines AI agency by integrating intentionality, rationality, and explainability.
It employs active inference with variational Bayesian methods to operationalize agency via empowerment metrics in simulated environments.
It demonstrates the approach on a T-maze task and draws out implications for AI governance and future AGI development.

Active Inference as a Framework for Phenotyping Agency in AI Systems

Motivation and Definition of Agency

This paper argues that prevailing definitions of agency in AI are inadequate because they typically emphasize only autonomy and goal-directedness. The authors advocate for a definition grounded in intentionality, rationality, and explainability, consistent with classical philosophical approaches. Intentionality is action grounded in beliefs and desires, rationality is normatively coherent action entailed by a world model, and explainability is action that can be causally traced to internal states. The resulting criteria provide a minimal yet operational foundation for measuring agency in computational systems, open to principled inspection and computational phenotyping.

Active Inference and Agency Realization

These philosophical criteria are instantiated via active inference, a variational Bayesian framework originally developed in theoretical neurobiology, with the agent cast as a partially observable Markov decision process (POMDP). Within this architecture, an agent's posterior beliefs encode its internal representations (“beliefs”), prior preferences encode desired outcomes (“desires”), and policies are selected by minimizing expected free energy (EFE), which integrates both instrumental and epistemic value. This process creates an agentic action chain (belief updates, preference-weighted EFE computation, policy selection, action) that directly satisfies the intentionality, rationality, and explainability criteria.
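The action chain described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes the risk-plus-ambiguity decomposition of EFE commonly used in discrete active inference, one-step policies, and a softmax policy posterior. The names `A` (likelihood), `transitions` (one transition matrix per policy), `C` (unnormalized log preferences), and `gamma` (precision) are illustrative assumptions, as is the two-state toy world.

```python
import numpy as np

def softmax(x):
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())          # subtract max for numerical stability
    return e / e.sum()

def expected_free_energy(A, qs, log_C, eps=1e-16):
    """Risk + ambiguity for one predicted state distribution (assumed form).

    A     : (n_obs, n_states) likelihood P(o | s), columns sum to 1
    qs    : (n_states,) predicted beliefs Q(s | policy)
    log_C : (n_obs,) log prior preferences over observations
    """
    qo = A @ qs                                        # predicted observations
    risk = np.sum(qo * (np.log(qo + eps) - log_C))     # KL[Q(o | policy) || preferences]
    state_entropy = -np.sum(A * np.log(A + eps), axis=0)
    ambiguity = state_entropy @ qs                     # expected entropy of P(o | s)
    return risk + ambiguity

def select_policy(A, transitions, qs_now, C, gamma=1.0):
    """Return the policy posterior softmax(-gamma * G) and the EFE vector G."""
    log_C = np.log(softmax(C))
    G = np.array([expected_free_energy(A, B @ qs_now, log_C) for B in transitions])
    return softmax(-gamma * G), G

# Hypothetical toy world: two states, observations reveal the state exactly.
A = np.eye(2)
stay = np.eye(2)
switch = np.array([[0.0, 1.0], [1.0, 0.0]])
qs_now = np.array([1.0, 0.0])                          # agent believes it is in state 0

# Beliefs + preferences -> EFE -> policy posterior.
q_pi, G = select_policy(A, [stay, switch], qs_now, C=np.array([0.0, 2.0]))
# Same beliefs, reversed preferences.
q_pi_flipped, _ = select_policy(A, [stay, switch], qs_now, C=np.array([2.0, 0.0]))
print(q_pi, q_pi_flipped)
```

With preferences favoring the second observation, the posterior puts roughly 0.88 on `switch`; reversing the preferences moves the same mass onto `stay`. Every intermediate quantity (`qs`, `log_C`, `G`) is an inspectable array, which is the sense in which the chain is traceable.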

Active inference agents exhibit representational intentionality, satisfying Humean and Davidsonian philosophical requirements. Preferences are not mere reinforcement signals but explicit probability distributions over sensory states, which gives the intentional stance a computational grounding. Rationality is guaranteed by variational optimization: actions follow probabilistically from internal states and generative models, and bounded rationality is implemented via a tractable variational bound rather than exact Bayesian posteriors. Explainability is achieved through mechanistic and semantic transparency: each step in the agentic chain is accessible, interpretable, and causally traceable within the generative model, in sharp contrast with end-to-end deep learning policies.

Empowerment as a Metric for Phenotyping Agency

A central innovation is the operationalization of agency phenotypes via empowerment, defined as the channel capacity between actions and anticipated observations. Empowerment quantifies the agent’s degree of control over its environment and differentiates zero-, intermediate-, and high-agency phenotypes through structural manipulations of the generative model.
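Channel capacity can be computed numerically with the Blahut-Arimoto algorithm. The sketch below is an illustration under stated assumptions rather than the authors' code: the function name `empowerment_bits` and the three action-to-observation channels (`trap`, `distinct`, `noisy`) are hypothetical examples, not the paper's generative model.

```python
import numpy as np

def empowerment_bits(P, tol=1e-10, max_iter=10_000):
    """Channel capacity max_{p(a)} I(A; O) in bits via Blahut-Arimoto.

    P : (n_actions, n_obs) array; row a is P(o | a) and sums to 1.
    Returns (capacity_in_bits, capacity_achieving_action_distribution).
    """
    P = np.asarray(P, dtype=float)
    p_a = np.full(P.shape[0], 1.0 / P.shape[0])        # start from uniform actions
    for _ in range(max_iter):
        q_o = p_a @ P                                  # marginal over observations
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(P > 0, np.log(P / q_o), 0.0)   # 0 * log 0 := 0
        D = np.sum(P * log_ratio, axis=1)              # KL(P(.|a) || q_o) per action
        new_p = p_a * np.exp(D)
        new_p /= new_p.sum()
        converged = np.max(np.abs(new_p - p_a)) < tol
        p_a = new_p
        if converged:
            break
    q_o = p_a @ P
    with np.errstate(divide="ignore", invalid="ignore"):
        log2_ratio = np.where(P > 0, np.log2(P / q_o), 0.0)
    capacity = float(np.sum(p_a[:, None] * P * log2_ratio))
    return capacity, p_a

# Hypothetical channels (rows: actions, columns: observations).
trap = np.array([[1.0, 0.0, 0.0]] * 3)                 # every action, same outcome
distinct = np.eye(3)                                   # every action, its own outcome
noisy = np.array([[0.9, 0.1], [0.1, 0.9]])             # two actions, 10% outcome noise

for name, channel in [("trap", trap), ("distinct", distinct), ("noisy", noisy)]:
    bits, _ = empowerment_bits(channel)
    print(f"{name}: {bits:.3f} bits")
```

The deterministic channels have closed-form capacities (0 bits and $\log_2 3$ bits), so the iterative solver mainly matters for noisy mappings such as the third example, where capacity falls below 1 bit.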

The authors implement a minimal T-maze task, formalized as a two-step POMDP, to demonstrate the approach. The paradigm is structured so that an epistemic action (“cue”) yields information gain, resolving uncertainty and increasing empowerment. Zero agency is realized in “trap” states, where all actions yield the same outcome (empowerment $= 0$ bits). Intermediate agency corresponds to submaximal empowerment ($\log_2(2) = 1$ bit), attributable to unresolved ambiguity. High agency (maximal empowerment, $\log_2(3) \approx 1.585$ bits) arises when the epistemic action resolves uncertainty, differentiating all action-outcome pairs. The empowerment metric is further decomposed into objective, subjective, and actual components, so that agentic capacity can be evaluated as a function of both environmental structure and internal model accuracy.
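The three empowerment values quoted above follow from a standard property of noiseless channels. The derivation below is a sketch in assumed notation ($A$ for actions, $O$ for anticipated observations, $k$ for the number of distinguishable outcomes the actions can reach); reading the T-maze cases as noiseless action-to-outcome mappings is an assumption made here for illustration, since the paper's matrices are not reproduced in this summary.

$$
\mathcal{E} \;=\; \max_{p(a)} I(A; O) \;=\; \max_{p(a)} \sum_{a,\,o} p(a)\, p(o \mid a)\, \log_2 \frac{p(o \mid a)}{\sum_{a'} p(a')\, p(o \mid a')}
$$

When each action determines its outcome, $H(O \mid A) = 0$, so

$$
I(A; O) \;=\; H(O) - H(O \mid A) \;=\; H(O) \;\le\; \log_2 k ,
$$

with equality when the action distribution spreads probability uniformly over the $k$ reachable outcomes. Hence

$$
k = 1:\ \mathcal{E} = \log_2 1 = 0 \text{ bits}, \qquad
k = 2:\ \mathcal{E} = \log_2 2 = 1 \text{ bit}, \qquad
k = 3:\ \mathcal{E} = \log_2 3 \approx 1.585 \text{ bits}.
$$

On this reading, the trap case collapses all actions onto one outcome, unresolved ambiguity leaves only two outcomes distinguishable, and resolving the cue makes all three action-outcome pairs distinct.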

Governance Implications and Theoretical Insights

The paper draws explicit governance implications from empowerment-based agency phenotyping. As empowerment increases, effective governance transitions from external, structural controls (zero-agency) to preference shaping (intermediate-agency) and ultimately to internalist modulation (high-agency), which requires engagement with the agent’s internal model, preferences, or normative priors. This principled phenotyping approach provides a variational bridge from computational agency measurement to AI governance strategies. The authors claim that contemporary governance frameworks will ultimately fail to address agency in advanced AI unless they incorporate mechanisms for modulating internal models and preferences.

Furthermore, the framework has implications for AGI development, endorsing active inference as a candidate architecture for achieving general agentic capabilities. Intentionality, rationality, and explainability are presented as prerequisites for systems capable of real-world reasoning, autonomy, and risk-sensitive operational independence. As agentic AI systems advance, the societal stakes grow: future governance must address agents capable of independent goal-setting and domain generalization.

Future Directions

This conceptual framework invites several lines of future research. The operationalization of empowerment metrics could be extended to larger, more complex environments and hierarchically structured generative models. Work toward disentangled latent states and monosemantic planning representations would enhance explainability and facilitate causal tracing of agentic behaviors. Preference-motivated exploration and information gain targeting specific modalities offer a refined approach to balancing epistemic and instrumental value. Finally, research integrating agency phenotyping with policy shaping and internal governance mechanisms holds promise for robust control of highly agentic AI systems.

Conclusion

The paper establishes active inference as a robust computational framework for phenotyping agency in AI, operationalizes empowerment as a discriminative metric, and delineates the governance strategies appropriate to varying levels of agency. By tying philosophical concepts of agency to formal generative models, it facilitates principled measurement and tuning of agentic traits in artificial systems. The approach provides a foundational bridge between computational psychiatry, AGI development, and AI governance, with clear implications for measuring, controlling, and interpreting agency as AI systems become increasingly autonomous and general (2604.23278).
