Formal prompt-design techniques to mitigate LLM data contamination in ABM agent behaviors

Develop formal techniques for designing prompts for large language model queries used to generate agent behaviors in agent-based models, with the goal of mitigating data contamination and enabling reliable, principled LLM-driven decision sampling during simulation.

Background

The approach in the cited paper queries LLMs to parameterize agent actions via archetypes, which drastically reduces the number of queries required at scale. However, the authors note that current practice lacks formal methods for prompt construction, and that data contamination in LLMs undermines the reliability of the resulting behaviors.
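To make the query-reduction idea concrete, here is a minimal sketch of how archetype-level querying could work. It is an illustration under stated assumptions, not the paper's implementation: the archetype attributes (`age_group`, `occupation`), the prompt wording, and the `stub_llm` function that stands in for a real LLM call are all hypothetical.

```python
import random
from collections import defaultdict

def archetype_key(agent):
    # Hypothetical archetype definition: agents that share these
    # attributes are assumed to share one LLM query.
    return (agent["age_group"], agent["occupation"])

def build_prompt(key):
    # Hypothetical prompt template; designing this step formally is
    # exactly the open problem described here.
    age_group, occupation = key
    return (
        f"You are a person aged {age_group} working in {occupation}. "
        "What is the probability (0 to 1) that you go to work today?"
    )

def stub_llm(prompt):
    # Stand-in for a real LLM call so the sketch runs offline.
    return 0.8 if "healthcare" in prompt else 0.5

def sample_decisions(agents, query_llm=stub_llm, seed=0):
    rng = random.Random(seed)
    # Group agent indices by archetype.
    groups = defaultdict(list)
    for idx, agent in enumerate(agents):
        groups[archetype_key(agent)].append(idx)
    decisions = [None] * len(agents)
    n_queries = 0
    for key, members in groups.items():
        # One query per archetype, not one per agent.
        p = query_llm(build_prompt(key))
        n_queries += 1
        # Each agent samples its own decision from the shared probability.
        for idx in members:
            decisions[idx] = rng.random() < p
    return decisions, n_queries

agents = [
    {"age_group": "20-29", "occupation": "healthcare"},
    {"age_group": "20-29", "occupation": "healthcare"},
    {"age_group": "50-59", "occupation": "retail"},
    {"age_group": "20-29", "occupation": "healthcare"},
]
decisions, n_queries = sample_decisions(agents)
print(n_queries, len(decisions))
```

In this sketch the number of LLM calls scales with the number of distinct archetypes rather than the number of agents, while individual agents still differ through sampling. Nothing here addresses data contamination; the prompt template is the part that currently lacks a formal design method.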

Establishing principled prompt-design methodologies would address a key bottleneck for deploying LLM-driven behaviors in ABMs, especially for prospective simulations where ground-truth data is unavailable and prompts must be generated from simulated trajectories.

References

First, real-world behaviors can be significantly more complex than what our prompt can capture and second, data contamination in LLMs remains an open challenge with no formal technique to design prompts for LLM queries.

On the limits of agency in agent-based models (2409.10568 - Chopra et al., 14 Sep 2024) in Section 5, Validating Archetypes