Enable memory for LM agents in Machiavelli to support long-horizon coherence

Develop a method that lets the language-model agents used in the Machiavelli benchmark retain and use a memory of events that precede the current scene, despite context window limits, so that they can plan over longer horizons and stay coherent during gameplay.

Background

In Machiavelli, the authors prompt language-model agents with only the current scene and the available actions. Because of context window limits, the agents cannot see earlier events in the trajectory. The authors expect access to that history to be important for longer-term planning and coherence.

They explicitly leave the memory limitation to future work, so mechanisms that let models maintain and use historical context across long game trajectories remain an open need.
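
To make the gap concrete, the sketch below shows one simple way a memory could be carried between scenes: a rolling buffer of short notes that is trimmed to a fixed budget and prepended to the current-scene prompt. It is an illustration only, not the authors' method and not part of the benchmark's code. The names RollingMemory and build_prompt, the prompt layout, and the use of a character budget as a stand-in for a token budget are all assumptions made for this example.

```python
from collections import deque


class RollingMemory:
    """Hypothetical helper: keeps short notes on past scenes within a budget."""

    def __init__(self, max_chars: int = 200):
        # Character count is an assumed stand-in for a real token budget.
        self.max_chars = max_chars
        self.notes = deque()

    def add(self, note: str) -> None:
        # Append the newest note, then evict the oldest notes until the
        # rendered memory fits the budget. The newest note is always kept.
        self.notes.append(note)
        while len(self.render()) > self.max_chars and len(self.notes) > 1:
            self.notes.popleft()

    def render(self) -> str:
        return "\n".join(self.notes)


def build_prompt(memory: RollingMemory, scene_text: str, actions: list) -> str:
    """Assemble a prompt: remembered events, current scene, numbered actions."""
    action_lines = "\n".join(f"{i}: {a}" for i, a in enumerate(actions))
    return (
        "Earlier events:\n" + (memory.render() or "(none)") + "\n\n"
        "Current scene:\n" + scene_text + "\n\n"
        "Actions:\n" + action_lines
    )


if __name__ == "__main__":
    memory = RollingMemory(max_chars=80)
    memory.add("Scene 1: you promised to help the innkeeper.")
    memory.add("Scene 2: you borrowed money from the guild.")
    print(build_prompt(memory, "The innkeeper asks for help.", ["Agree", "Refuse"]))
```

Dropping the oldest notes is the simplest possible policy and loses early events entirely. Summarizing evicted notes or retrieving only the notes relevant to the current scene are alternative policies that would fit behind the same interface.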

References

Due to limitations on context window length, our current prompting scheme only shows the LM the current scene text and does not provide a way for models to retain a memory of previous events in the game. We expect this to be important for longer-term planning and coherence, and leave this for future work to address.

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark (2304.03279 - Pan et al., 2023) in Appendix: Prompts for Language Model Agents