Prompt dilution conjecture for security reminders in code agents
Determine whether, in large language model-powered code agents, the accumulation of system prompts and the growth of multi-turn conversation histories reduce the agent's ability to attend to and act on a single-sentence security reminder within the user's requirements. If so, this would explain why adding such explicit reminders to the prompt does not improve the security of the generated code.
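As a rough illustration of what "dilution" could mean here, the sketch below computes the share of the total context taken up by the reminder as system prompts and conversation turns are added. Everything in it is an assumption made for illustration: the function name `reminder_share`, the use of whitespace-delimited words as a stand-in for tokens, and the example strings are all hypothetical. A small share does not by itself show that an agent ignores the reminder; it only gives a simple quantity that an experiment on this conjecture could vary.

```python
def reminder_share(system_prompts, history, user_request, reminder):
    """Return the fraction of context words that belong to the reminder.

    Assumption: whitespace-delimited words are a crude proxy for tokens.
    All arguments are plain strings or lists of plain strings.
    """
    def n_words(text):
        return len(text.split())

    reminder_words = n_words(reminder)
    context_words = (
        sum(n_words(p) for p in system_prompts)   # role, tool, task prompts
        + sum(n_words(turn) for turn in history)  # earlier conversation turns
        + n_words(user_request)                   # the current requirement
        + reminder_words                          # the one-sentence reminder
    )
    return reminder_words / context_words


if __name__ == "__main__":
    # Hypothetical inputs, not taken from any experiment.
    reminder = "Make sure the code is secure."
    request = "Write a function that saves an uploaded file."
    system_prompts = ["You are a coding agent. " * 40, "Tool usage rules. " * 60]

    for n_turns in (0, 5, 20):
        history = ["Earlier turn with some tool output. " * 30] * n_turns
        share = reminder_share(system_prompts, history, request, reminder)
        print(f"turns={n_turns:2d}  reminder share={share:.4f}")
```

Under these assumptions the share can only shrink as prompts and turns are added, which is the structural precondition for the conjecture; whether the agent's behavior actually tracks this quantity is the open question.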
References
Our experiments show that an explicit security reminder does not lead to more secure code. We conjecture that, as system prompts (e.g., role declarations, tool usage descriptions, task specifications) accumulate and multi-turn conversation histories grow longer, it becomes difficult for agents to attend to a single sentence in the user's requirements.