PromptGen Agent
- PromptGen Agent is a system that translates high-level narratives into fully structured agent-based model (ABM) definitions using componentwise prompts.
- It utilizes reusable QA-style prompt templates to extract agents, state variables, behaviors, interactions, and schedules into deterministic, machine-readable JSON.
- The approach integrates mathematical dynamics and pseudocode generation, enabling seamless conversion of natural language into simulation-ready ABM logic.
A PromptGen Agent, in the context of conceptual agent-based model (ABM) extraction with LLMs, is an orchestrated system designed to autonomously translate high-level narrative specifications into fully structured ABM definitions. This is accomplished through systematic, componentwise prompt engineering coupled with machine- and human-readable output formats, notably JSON, which facilitate both manual inspection and automated code generation. The approach leverages QA-style prompt templates and iterative workflows to elicit agents, their state variables, behaviors, interactions, schedules, and associated mathematical dynamics directly from the source text (Khatami et al., 2024).
1. QA-Style Prompt Engineering Templates
The core operational mechanism of the PromptGen Agent is a library of minimal, reusable QA prompts, each targeting a distinct ABM specification component:
- Agents: Prompts designed to extract agent types, their roles/purposes, and key state variables.
- State Variables: Agent-specific prompts to retrieve detailed descriptions, data types, and initialization values for all relevant state variables.
- Behaviors: Prompts aimed at cataloging behaviors/actions performed by agents, including triggers and explicit effects on state variables.
- Interactions: Structured prompts to capture pairwise/group interactions, describing participants, domain semantics, and rates/probabilities when available.
- Schedules: Specialized prompts for inferring timing mechanics (discrete, continuous, intervals, rates) associated with each behavior or interaction.
All prompts specify strict output schemas (pure JSON), facilitating deterministic extraction and post-processing. The agent supports both inline narrative analysis (“Analyze the following text…”) and attachment-based extraction (Khatami et al., 2024).
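A per-component QA prompt can be sketched as a simple string template; the wording, function name, and schema hint below are illustrative assumptions, not the exact prompts from Khatami et al. (2024):

```python
# Sketch of a reusable QA-style prompt template for one ABM component.
# The prompt wording and schema hint are illustrative, not the originals.

def build_component_prompt(component: str, schema_hint: str, narrative: str) -> str:
    """Assemble an inline-narrative extraction prompt with a strict JSON schema."""
    return (
        f"Analyze the following text and extract the {component}.\n"
        f"Return pure JSON only, matching this schema:\n{schema_hint}\n\n"
        f"Text:\n{narrative}"
    )

agents_schema = '{"agents": [{"name": "...", "role": "...", "attributes": ["..."]}]}'
prompt = build_component_prompt(
    "agent types", agents_schema, "Wolves hunt rabbits on a shared grid."
)
```

The same builder is reused for state variables, behaviors, interactions, and schedules by swapping in the component name and its schema hint.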
2. Worked Example: End-to-End Extraction Flow
A canonical workflow involves the following sequence:
- Narrative Ingestion: The high-level system narrative is injected as input.
- Sequential Prompting: The PromptGen Agent runs the appropriate QA-style prompt for each ABM component in turn.
- Parsing and Validation: Raw LLM outputs are parsed and validated as JSON fragments against the specified schema.
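The parsing-and-validation step can be sketched as follows; the fence-stripping heuristic and function name are assumptions, since models often wrap JSON in markdown fences even when asked for pure JSON:

```python
import json

def parse_llm_fragment(raw: str, required_key: str) -> dict:
    """Parse a raw LLM reply as JSON and check for the expected top-level key."""
    text = raw.strip()
    if text.startswith("```"):
        # Strip an optional markdown code fence around the JSON payload.
        text = text.split("```")[1]
        if text.startswith("json"):
            text = text[len("json"):]
    fragment = json.loads(text)
    if required_key not in fragment:
        raise ValueError(f"missing top-level key: {required_key}")
    return fragment

reply = '{"agents": [{"name": "Wolf", "role": "predator", "attributes": ["energy"]}]}'
agents = parse_llm_fragment(reply, "agents")
```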
Illustrative Invocation (Predator-Prey Narrative):
- Agents output:
```json
{
  "agents": [
    { "name": "Wolf", "role": "predator", "attributes": ["energy", "location"] },
    { "name": "Rabbit", "role": "prey", "attributes": ["age", "location"] }
  ]
}
```
- State Variables for Wolf:
```json
{
  "agent": "Wolf",
  "state_variables": [
    { "name": "energy", "description": "current energy level", "data_type": "integer", "initial_value": 10 },
    { "name": "location", "description": "grid coordinates", "data_type": "tuple", "initial_value": "(0,0)" }
  ]
}
```
- Behaviors output:
```json
{
  "behaviors": [
    {
      "name": "Hunt",
      "trigger": "Rabbit present in same cell",
      "effects": [
        { "variable": "Rabbit.population", "delta": "-1" },
        { "variable": "Wolf.energy", "delta": "+5" }
      ]
    },
    {
      "name": "Reproduce",
      "trigger": "at each day end",
      "effects": [
        { "variable": "Rabbit.population", "delta": "+ births from logistic formula" }
      ]
    }
  ]
}
```
3. Machine-Readable JSON Schema
The process employs a hierarchical, extensible JSON schema compatible with downstream auto-code generation and simulation tools. The schema encapsulates:
```json
{
  "agents": [
    { "name": "string", "role": "string", "attributes": ["string", "..."] }
  ],
  "state_variables": {
    "<Agent>": [
      { "name": "string", "description": "string", "data_type": "string", "initial_value": "number|string" }
    ]
  },
  "behaviors": [
    { "name": "string", "trigger": "string", "effects": [ { "variable": "string", "delta": "string" } ] }
  ],
  "interactions": [
    { "participants": ["string", "..."], "description": "string", "rate": "number|string|null" }
  ],
  "schedules": [
    { "name": "string", "timing_type": "string", "parameters": { "string": "number|string" } }
  ]
}
```
This schema supports both human review and automated translation into code templates or simulation configurations (Khatami et al., 2024).
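Schema conformance can be checked per fragment before merging; the hand-rolled validator below is a minimal sketch for the `behaviors` component, not a full JSON Schema implementation:

```python
# Minimal structural check for an extracted 'behaviors' fragment against
# the schema above; an illustrative sketch, not a full validator.

def check_behaviors(fragment: dict) -> list:
    """Return a list of human-readable problems found in the fragment."""
    problems = []
    for i, b in enumerate(fragment.get("behaviors", [])):
        for field in ("name", "trigger", "effects"):
            if field not in b:
                problems.append(f"behaviors[{i}] missing '{field}'")
        for j, e in enumerate(b.get("effects", [])):
            if not {"variable", "delta"} <= e.keys():
                problems.append(f"behaviors[{i}].effects[{j}] incomplete")
    return problems

ok = {"behaviors": [{"name": "Hunt", "trigger": "Rabbit present",
                     "effects": [{"variable": "Wolf.energy", "delta": "+5"}]}]}
bad = {"behaviors": [{"name": "Hunt", "effects": [{"variable": "x"}]}]}
```

Fragments that fail the check can be re-prompted rather than silently merged.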
4. Mathematical Dynamics and Formula Embedding
Agent-based model dynamics are articulated using representative LaTeX formulas embedded in prompt outputs or post-processing steps:
- Logistic population growth: dR/dt = r · R · (1 − R/K)
- Encounter rate (mass action): λ = β · N_W · N_R
- Discrete-state transition probability: P = 1 − e^(−λ·Δt)
Such formulas are attached to relevant behaviors or interactions within the JSON specification, supporting unambiguous mathematical grounding for code generation or simulation (Khatami et al., 2024).
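A numeric sketch of these dynamics, using standard forms (logistic growth dR/dt = r·R·(1 − R/K), a mass-action encounter rate λ = β·W·R, and P = 1 − e^(−λ·Δt) for a transition within a time step); the r = 0.5, K = 50 values match the extracted delta formula, while β and Δt are illustrative:

```python
import math

def logistic_births(R: float, r: float = 0.5, K: float = 50.0) -> float:
    """Logistic growth term: r*R*(1 - R/K)."""
    return r * R * (1.0 - R / K)

def encounter_rate(wolves: int, rabbits: int, beta: float = 0.01) -> float:
    """Mass-action encounter rate: beta * W * R (beta is illustrative)."""
    return beta * wolves * rabbits

def transition_probability(rate: float, dt: float = 1.0) -> float:
    """Probability of at least one event in dt given a Poisson rate."""
    return 1.0 - math.exp(-rate * dt)

births = logistic_births(20)  # 0.5 * 20 * (1 - 20/50) = 6.0
p = transition_probability(encounter_rate(3, 20))
```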
5. Full Extraction Cycle: Prompt to Pseudocode
The PromptGen Agent executes the following inferential loop:
- Issue behavior prompt for narrative.
- Obtain LLM JSON response specifying state change formulas:
```json
{
  "behaviors": [
    {
      "name": "Reproduce",
      "trigger": "daily",
      "effects": [
        { "variable": "Rabbit.population", "delta": "+0.5*R*(1 - R/50)" }
      ]
    }
  ]
}
```
- Post-process into simulation pseudocode:
```
# For daily reproduction of rabbits:
R = RabbitCell.count
delta = 0.5 * R * (1 - R/50)
RabbitCell.count += floor(delta)
```
This end-to-end pipeline validates the agent’s capability to convert natural language into executable ABM logic (Khatami et al., 2024).
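The generated pseudocode translates directly into runnable code; `RabbitCell` below is an illustrative stand-in for a grid cell in the target simulation, not a class from the source:

```python
import math

class RabbitCell:
    """Illustrative grid cell holding a rabbit count."""

    def __init__(self, count: int):
        self.count = count

    def reproduce(self, r: float = 0.5, K: float = 50.0) -> None:
        """Apply the extracted logistic update: delta = r*R*(1 - R/K)."""
        R = self.count
        self.count += math.floor(r * R * (1 - R / K))

cell = RabbitCell(count=20)
cell.reproduce()  # delta = 0.5*20*(1 - 0.4) = 6.0, so count becomes 26
```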
6. Best Practices and Refinement Strategies
Empirical guidelines for achieving optimal, reproducible outputs include:
- Decomposition: Break complex narratives into single-component prompts for accuracy.
- Determinism: Use LLM temperature settings in [0.0, 0.2] to ensure schema-compliant, reproducible outputs.
- Schema Validation: Confirm every output fragment against the target JSON schema.
- Clarification Protocols: Add explicit instructions for handling missing fields and null rates.
- Self-Consistency: Repeat each extraction several times, check agreement across runs, and merge divergent outputs via majority vote.
- Hierarchical Extraction: For nested properties, apply two-pass prompts: first list items, then extract detailed attributes per item.
- Explicit No-Data Handling: Instruct agent to return empty structures when a component is absent in the narrative.
These practices result in robust extraction pipelines suitable for integration with ABM simulators or code generators (Khatami et al., 2024).
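The self-consistency step can be sketched as follows; comparing canonical JSON strings is a simple, illustrative equality criterion for "the same fragment," and the function name is an assumption:

```python
import json
from collections import Counter

def majority_vote(fragments: list) -> dict:
    """Return the extraction fragment produced most often across repeated runs."""
    # Canonicalize each fragment so structurally equal dicts compare equal.
    canon = [json.dumps(f, sort_keys=True) for f in fragments]
    winner, _ = Counter(canon).most_common(1)[0]
    return json.loads(winner)

runs = [
    {"agents": [{"name": "Wolf"}]},
    {"agents": [{"name": "Wolf"}]},
    {"agents": [{"name": "Wolff"}]},  # one noisy run is outvoted
]
merged = majority_vote(runs)
```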
7. Significance and Scope
The PromptGen Agent architecture for conceptual agent-based model extraction demonstrates that LLMs, when guided by precise, QA-style prompt patterns and structured output schemas, enable full or partial automation of ABM design. This approach bypasses traditional manual translation, expedites model construction, and aligns outputs with the requirements of both domain scientists and computational modelers. Its modular refinement addresses scalability and correctness, supporting interoperability with downstream modeling workflows and code generation frameworks (Khatami et al., 2024).