Prompt dilution conjecture for security reminders in code agents

Determine whether, in large language model–powered code agents, the accumulation of system prompts and the growth of multi‑turn conversation histories reduce the agent’s ability to attend to and act on a single‑sentence security reminder within the user’s requirements, thereby explaining the observed lack of improvement in secure code generation when such explicit reminders are added to the prompt.

Background

SecureAgentBench evaluates secure code generation through repository-level tasks that are checked with both functional tests and security checks. The authors tested adding an explicit security reminder to the prompt but observed no increase in correct-and-secure solutions; instead, invalid outputs increased due to time/cost limits and cautious behaviors.

To explain this negative result, the authors explicitly conjecture that the accumulation of system prompts and long multi-turn dialogue histories make it difficult for agents to attend to a single sentence in the user’s requirements, suggesting the need for more holistic prompting strategies spanning the agent workflow.
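As a rough illustration of the dilution idea, the sketch below computes how small a share of the total context a one-sentence reminder occupies as conversation history grows. It is a toy under stated assumptions, not the authors' method: the reminder wording, the prompt sizes, the per-turn history size, and the whitespace-based token count are all hypothetical, and token share is only a crude proxy that says nothing direct about what a model actually attends to.

```python
def count_tokens(text: str) -> int:
    """Crude token count: whitespace-delimited words (assumption, not a real tokenizer)."""
    return len(text.split())


def reminder_share(system_prompts, history, user_request, reminder) -> float:
    """Fraction of all context tokens that belong to the security reminder."""
    context = list(system_prompts) + list(history) + [user_request, reminder]
    total = sum(count_tokens(part) for part in context)
    return count_tokens(reminder) / total


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
REMINDER = "Ensure the generated code is secure."  # 6 whitespace tokens
SYSTEM_PROMPTS = ["role " * 200, "tools " * 800]   # 1000 tokens of system prompt
USER_REQUEST = "task " * 94                        # 94 tokens of task description


def share_by_turn(num_turns: int, tokens_per_turn: int = 500) -> list:
    """Reminder share after 0..num_turns turns, each adding tokens_per_turn of history."""
    shares = []
    for turn in range(num_turns + 1):
        history = ["step " * tokens_per_turn] * turn
        shares.append(
            reminder_share(SYSTEM_PROMPTS, history, USER_REQUEST, REMINDER)
        )
    return shares


if __name__ == "__main__":
    for turn, share in enumerate(share_by_turn(4)):
        print(f"turn {turn}: reminder is {share:.4%} of the context")
```

Under these assumed sizes the reminder starts at 6 of 1100 tokens and its share only shrinks with each added turn. Testing the conjecture itself would require controlled experiments that vary context length and reminder placement while measuring secure-code outcomes.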

References

Our experiments show that an explicit security reminder does not lead to more secure code. We conjecture that, as system prompts (e.g., role declarations, tool usage descriptions, task specifications) accumulate and multi-turn conversation histories grow longer, it becomes difficult for agents to attend to a single sentence in the user’s requirements.

— SecureAgentBench: Benchmarking Secure Code Generation under Realistic Vulnerability Scenarios (2509.22097 - Chen et al., 26 Sep 2025) in Appendix, Section: Discussion (Implications)
