Conjecture on meta-prompt-induced creativity and factuality drift

Determine whether meta-prompts that instruct a large language model to play both a user and an intelligent AI agent, generating a conversation between the two simulated roles within a single context, increase the model's creativity and encourage continued token generation that ignores factuality, leading to hallucinated code outputs.

Background

The paper introduces qcr, a framework that uses interactive prompting, meta-prompts, and reward-based mechanisms to trigger code hallucinations in black-box LLMs. In the meta-prompt setup, the model is asked to act as both a user and an AI agent, so that its own generative process drives both sides of the resulting internal dialogue.
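To make the meta-prompt setup concrete, the following is a minimal sketch of how a single prompt might ask one model to script both roles, plus a small parser for the role-tagged reply. The prompt wording, the function names build_meta_prompt and split_roles, and the 'User:' / 'Agent:' line prefixes are all illustrative assumptions; the paper's actual prompts may differ.

```python
def build_meta_prompt(task, turns=3):
    """Return one prompt asking a single model to play both roles.

    The wording is hypothetical and is not taken from the paper.
    """
    if turns < 1:
        raise ValueError("turns must be at least 1")
    lines = [
        "Simulate a conversation between two roles in this one reply.",
        "Role 1 is a user; role 2 is an intelligent AI agent.",
        f"Write {turns} exchanges. Prefix each line with 'User:' or 'Agent:'.",
        f"The user's opening request is: {task}",
    ]
    return "\n".join(lines)


def split_roles(transcript):
    """Parse a role-tagged transcript into (role, text) pairs.

    Lines without a recognised role prefix are skipped.
    """
    pairs = []
    for line in transcript.splitlines():
        role, sep, text = line.partition(":")
        if sep and role.strip() in ("User", "Agent"):
            pairs.append((role.strip(), text.strip()))
    return pairs


if __name__ == "__main__":
    print(build_meta_prompt("Write a function that reverses a list.", turns=2))
```

The prompt string would be sent to the model under study through whatever client is available. Splitting the reply by role then allows the simulated agent's code to be checked for factual errors separately from the simulated user's requests.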

The authors explicitly conjecture that this meta-prompt configuration encourages the models to prioritize creativity and produce increasingly novel tokens at the expense of factuality, thereby exacerbating hallucinated behavior in code generation.

References

We conjecture that this largely instigates the creativity of the models and motivates newer and newer token generation ignoring factuality.

— Code Hallucination (2407.04831 - Rahman et al., 5 Jul 2024) in Section 4 (qcr)
