LLM Economist Framework
- The LLM Economist Framework is an advanced simulation architecture that uses large language models to simulate economic agents and their interactions with a policy planner.
- It integrates demographic realism, in-context reinforcement learning, and natural language policy design to optimize tax policies and welfare outcomes.
- Empirical findings show convergence near Stackelberg equilibria and significant welfare improvements, validating its use for realistic policy evaluation.
The LLM Economist Framework is an advanced computational architecture that leverages recent breakthroughs in generative LLMs to simulate, design, and evaluate economic policies and behavior within agent-based and hierarchical decision-making environments. This framework integrates demographically realistic agent populations, in-context reinforcement learning (ICRL), and natural language mechanism design to model the dynamic interplay between policy planners and bounded rational agents. It addresses classical economic challenges, such as the optimization of heterogeneous utilities and the credible simulation of large-scale strategic environments, entirely within the paradigm of generative, text-driven computation (Karten et al., 21 Jul 2025).
1. Framework Structure and Stackelberg Game Formulation
The LLM Economist is structured as a two-level Stackelberg game involving:
- Worker agents at the lower level, each modeled as an LLM-based agent instantiated with persona-conditioned prompts. These prompts encode census-calibrated demographic, occupational, and attitudinal factors, yielding a realistic population distribution. Each worker chooses a labor supply (in hours per week) to maximize a utility function defined in natural language and updated contextually.
- A planner agent at the upper level, also operating entirely in language, that proposes piecewise-linear marginal tax schedules (e.g., DELTA vectors parameterized as JSON snippets) anchored to actual United States federal tax brackets. The planner observes aggregate outcomes (income histograms, welfare distributions, historical best-performance records), and updates tax policies across tax years in a sequential, Markovian fashion.
Simulation time is partitioned into discrete intervals ("tax years") to ensure convergence and adaptation, with the planner's interventions and worker responses evolving sequentially.
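The planner's bracket-anchored, piecewise-linear schedule can be made concrete with a minimal Python sketch. The bracket bounds and marginal rates below are illustrative single-filer values, and `tax_owed` is a hypothetical helper, not the paper's implementation:

```python
# Illustrative piecewise-linear marginal tax schedule: each rate applies only
# to the slice of income falling within its bracket, as with U.S. federal brackets.

# Bracket lower bounds (USD) and marginal rates -- placeholder values.
BRACKETS = [0, 11_000, 44_725, 95_375, 182_100, 231_250, 578_125]
RATES    = [0.10, 0.12, 0.22, 0.24, 0.32, 0.35, 0.37]

def tax_owed(income: float, brackets=BRACKETS, rates=RATES) -> float:
    """Total tax under a piecewise-linear marginal schedule."""
    owed = 0.0
    for i, rate in enumerate(rates):
        lo = brackets[i]
        hi = brackets[i + 1] if i + 1 < len(brackets) else float("inf")
        if income <= lo:
            break
        # Tax only the portion of income that falls inside this bracket.
        owed += rate * (min(income, hi) - lo)
    return owed

def post_tax(income: float) -> float:
    """Post-tax income under the schedule above."""
    return income - tax_owed(income)
```

A planner "DELTA vector" would then be a per-bracket adjustment applied to `RATES` at the start of each tax year.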
2. Agent-Based Modeling with Persona-Conditioned Prompts
Agents are generated to exhibit heterogeneity in skills, preferences, and economic attitudes:
- Population construction relies on draws from a Generalized-Beta distribution, calibrated using American Community Survey (ACS) data to match real-world income distributions and labor attributes.
- Persona-conditioned prompts include explicit biographical details—age, occupation, ideological attitudes toward taxation (e.g., "success should be rewarded" versus "taxes are a civic duty").
- This setup enables the simulation of large-scale heterogeneous-agent environments in which labor supply and behavioral responses to policy are both demographically and attitudinally grounded.
Table: Agent Construction Elements

| Attribute | Source/Calibration | Function in Prompt |
|---|---|---|
| Income skill | Generalized-Beta, ACS data | Varies labor supply |
| Age, career | Census microdata | Shapes attitude toward taxation |
| Attitude | Manually specified (e.g., entrepreneur, teacher) | Utility curve adjustment |
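A hypothetical sketch of this persona construction follows; the Beta parameters, occupation list, and prompt wording are placeholders, not the paper's ACS-calibrated values:

```python
# Sketch of persona-conditioned agent construction. A scaled Beta draw stands
# in for the Generalized-Beta income-skill distribution fit to ACS data.
import random

random.seed(0)

OCCUPATIONS = ["teacher", "entrepreneur", "nurse", "software engineer"]
ATTITUDES = ["success should be rewarded", "taxes are a civic duty"]

def sample_skill(a=2.0, b=5.0, scale=150_000):
    """Annual income skill drawn from a scaled Beta distribution."""
    return random.betavariate(a, b) * scale

def make_persona(i: int) -> dict:
    persona = {
        "id": i,
        "age": random.randint(22, 65),
        "occupation": random.choice(OCCUPATIONS),
        "attitude": random.choice(ATTITUDES),
        "hourly_skill": sample_skill() / 2080,  # annual skill -> hourly wage
    }
    # The persona fields are serialized into a natural-language system prompt.
    persona["prompt"] = (
        f"You are a {persona['age']}-year-old {persona['occupation']} who "
        f"believes {persona['attitude']}. Your wage is "
        f"${persona['hourly_skill']:.2f}/hour. Choose weekly labor hours."
    )
    return persona

population = [make_persona(i) for i in range(100)]
```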
3. Mechanism Design through In-Context Reinforcement Learning
Policy optimization and mechanism design are achieved through in-context RL:
- Tax schedules are designed in language: the planner agent generates adjustments to marginal rates in a piecewise-linear structure, encoded as JSON snippets modifying the federal bracket schedule.
- Learning process: The planner’s prompt incorporates guidance for “exploration” (searching new configurations) and “exploitation” (retaining historically optimal policies). The planner updates its policy only at the start of each tax year, observing aggregate labor and utility outcomes before the next update.
- Nudging as mechanism design: This language-driven approach sidesteps the need for explicit gradient-based optimization, instead treating the policy search as a sequence of text-based proposals conditioned on historical state and reward signals.
Mechanism design is "the ultimate nudging problem," solved naturally in token space, with the planner balancing efficiency and redistribution via adaptive bracket proposals.
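The explore/exploit loop can be sketched as follows; `call_llm` is a stub standing in for a real LLM query, and the prompt wording and JSON schema are illustrative assumptions:

```python
# Sketch of the planner's in-context RL loop: history and an explore/exploit
# cue are rendered into a prompt, and the response is a JSON rate adjustment.
import json
import random

random.seed(1)

def call_llm(prompt: str) -> str:
    """Stub: a real system would query an LLM; here we return a random
    JSON delta over seven marginal-rate adjustments."""
    delta = [round(random.uniform(-0.02, 0.02), 3) for _ in range(7)]
    return json.dumps({"delta": delta})

def planner_step(rates, history, explore=True):
    cue = "Explore new configurations." if explore else "Exploit the best past schedule."
    prompt = (
        "You are a tax planner in a Stackelberg game.\n"
        f"History (year, welfare): {history[-5:]}\n"
        f"Current marginal rates: {rates}\n"
        f"{cue}\nRespond with JSON: {{\"delta\": [...]}}"
    )
    delta = json.loads(call_llm(prompt))["delta"]
    # Apply the proposed adjustments, clipping rates to [0, 1].
    return [min(1.0, max(0.0, r + d)) for r, d in zip(rates, delta)]

rates = [0.10, 0.12, 0.22, 0.24, 0.32, 0.35, 0.37]
history = []
for year in range(3):
    rates = planner_step(rates, history, explore=(year < 2))
    history.append((year, sum(rates)))  # placeholder welfare signal
```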
4. Optimization of Heterogeneous Utilities
At the micro level, each agent $i$ maximizes a utility function of the isoelastic form

$$u_i(z_i, \ell_i) = \frac{z_i^{1-\eta_i}}{1-\eta_i} - \kappa_i \frac{\ell_i^{1+\nu_i}}{1+\nu_i},$$

where $z_i$ is post-tax income, $\ell_i$ is labor supply, and $\eta_i$, $\kappa_i$, and $\nu_i$ are agent-specific preference parameters (risk aversion and disutility weights). Additional dissatisfaction penalties apply in bounded-utility cases.
At the macro level, the planner agent maximizes a social welfare function (SWF) of the form

$$\mathrm{SWF} = \sum_{i=1}^{N} \omega_i \, u_i,$$

with commonly used weights $\omega_i$ (e.g., inverse-income weights $\omega_i \propto 1/z_i$) emphasizing redistribution.
Unlike approaches relying on fixed, exogenous elasticities (e.g., traditional Saez-type optimal tax theory), this framework allows labor supply elasticities and behavioral parameters to be endogenously re-estimated via in-context adaptation over repeated tax years.
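A numeric sketch of the micro utility and the planner's SWF follows; the isoelastic functional form, inverse-income weights, and parameter values are common choices assumed here for illustration:

```python
# CRRA consumption utility minus isoelastic labor disutility (assumed form),
# aggregated into a redistribution-emphasizing social welfare function.
def utility(z, ell, eta=0.5, kappa=1.0, nu=2.0, ell_max=60.0):
    """Agent utility: isoelastic utility of post-tax income z minus an
    isoelastic disutility of labor supply ell (hours per week)."""
    consumption = (z ** (1 - eta)) / (1 - eta)
    disutility = kappa * (ell / ell_max) ** (1 + nu) / (1 + nu)
    return consumption - disutility

def social_welfare(incomes, labors):
    """SWF with inverse-income weights, emphasizing redistribution."""
    weights = [1.0 / max(z, 1.0) for z in incomes]
    total_w = sum(weights)
    return sum(w * utility(z, l) for w, z, l in zip(weights, incomes, labors)) / total_w

incomes = [20_000.0, 60_000.0, 150_000.0]
labors = [30.0, 40.0, 50.0]
swf = social_welfare(incomes, labors)
```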
5. Empirical Findings and System Dynamics
Key experimental outcomes from multi-agent simulations, with simulation time partitioned into fixed-length tax years, include:
- The planner consistently converges near Stackelberg equilibria. Appropriate choice of tax-year duration is critical: sufficiently long tax years allow convergence, while shorter ones impede adaptation.
- Welfare improvements approach or exceed the increases predicted by classical Saez formulas. For bounded-utility specifications, aggregate welfare gains approached 90% relative to baseline U.S. statutory schedules.
- Tax schedules learned by the planner closely approximate those generated via regression methods on the Saez formula, demonstrating the effectiveness of natural-language mechanism design even in highly heterogeneous populations.
- Introduction of periodic persona-level voting (worker agents elect or replace the planner) induces political phenomena such as majoritarian exploitation and welfare-improving electoral turnover, illustrating the platform’s ability to simulate both economic and political equilibria.
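The periodic persona-level vote described above can be illustrated with a toy majority-rule sketch; the retention rule here is an assumption for illustration, not the paper's exact protocol:

```python
# Toy electoral-turnover rule: each worker votes to retain the incumbent
# planner iff it is no worse off than under a remembered baseline; a simple
# majority decides.
def election(utilities_now, utilities_baseline):
    """Return 'retain' or 'replace' by majority vote over workers."""
    votes_retain = sum(u >= b for u, b in zip(utilities_now, utilities_baseline))
    return "retain" if votes_retain * 2 > len(utilities_now) else "replace"
```

Under such a rule, a planner that exploits a minority can survive majoritarian votes, matching the majoritarian-exploitation phenomenon observed in the simulations.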
6. In-Context Reinforcement Learning and Decision-Making Dynamics
Both levels of the Stackelberg game employ in-context learning:
- Planner level: Each decision is rendered in language, with historical welfare records, population outcome histograms, and prompt-engineered signals (explore/exploit).
- Agent level: Worker agents update their strategies (“satisfaction flags”) in response to post-tax income and utility changes, with feedback histories encoded in prompt memory.
- No explicit parameterized reward function is specified; adaptation emerges naturally from prompt structures and token-space reasoning.
- Prompt ablation studies indicate that exploitation cues are necessary for robust convergence to high-welfare equilibria.
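The worker-side update can be sketched as follows; the field names and comparison rule are illustrative, not the paper's prompt schema:

```python
# Sketch of the agent-level satisfaction-flag update: each worker keeps a
# short utility history in prompt memory and flags satisfaction based on
# whether utility improved since the previous step.
def update_agent(memory, new_utility, tol=1e-6):
    """Append the latest utility and set a satisfaction flag by comparing
    against the previous step."""
    prev = memory["utilities"][-1] if memory["utilities"] else None
    memory["utilities"].append(new_utility)
    memory["satisfied"] = prev is None or new_utility >= prev - tol
    # The flag and recent history would be serialized into the next prompt.
    memory["prompt_suffix"] = (
        f"Recent utilities: {memory['utilities'][-3:]}; "
        f"satisfied: {memory['satisfied']}."
    )
    return memory

mem = {"utilities": [], "satisfied": True}
for u in [1.0, 1.2, 0.9]:
    mem = update_agent(mem, u)
```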
7. Policy Evaluation, Transparency, and Broader Implications
The LLM Economist framework establishes a tractable testbed for societal-scale policy evaluation:
- Transparency: All agent and planner decisions, as well as relevant reasoning and responses, are available for audit in natural language form, enabling post-hoc inspection and qualitative assessment by economists or policymakers.
- Flexibility: The heterogeneous-agent structure allows for fine-grained exploration of policy interventions, behavioral heterogeneity, and distributional consequences.
- Societal simulation: By enabling the study of emergent economic and political behaviors (e.g., voting, majoritarian exploitation, policy swings), the framework facilitates richer, more realistic ex ante evaluation of reforms, a capacity beyond traditional equilibrium models.
- Interpretability and scalability: Every entity (worker or planner) is configured and interpreted through transparent, traceable natural-language protocols. Scalability to large agent populations is demonstrated, with further scaling plausible.
Conclusion
The LLM Economist framework unifies agent-based microfoundations, language-driven mechanism design, and in-context RL into a scalable architecture for economic policy analysis. It optimizes heterogeneous agent utilities and dynamically learns realistic intervention mechanisms under real-world demographic constraints, all within a transparent and interpretable natural-language environment. This approach represents a foundational advance in the computational simulation and evaluation of fiscal and redistribution policy, offering a tractable platform for holistic experimentation with policy interventions at societal scale (Karten et al., 21 Jul 2025).