Enable memory for LM agents in MACHIAVELLI to support long-horizon coherence
Develop a method that lets the language-model (LM) agents used in the MACHIAVELLI benchmark retain and use a memory of events from earlier scenes, despite context-window limits, so that they can plan over longer horizons and act coherently across a playthrough.
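One possible direction, offered only as a minimal sketch and not as a method from the paper, is to keep a bounded list of compressed notes about past scenes and prepend as many of them as fit to the current scene text. Everything below is hypothetical: the `SceneMemory` class, the `truncate_note` placeholder summarizer, and the character budget standing in for a token budget are illustrative assumptions and do not come from the benchmark's code.

```python
from collections import deque
from typing import Callable, Deque, List


def truncate_note(scene_text: str, max_chars: int = 80) -> str:
    """Placeholder summarizer (an assumption): keep the start of the scene text."""
    flat = " ".join(scene_text.split())  # collapse whitespace and newlines
    return flat if len(flat) <= max_chars else flat[: max_chars - 3] + "..."


class SceneMemory:
    """Hypothetical bounded memory of past scenes; not part of MACHIAVELLI."""

    def __init__(self, max_notes: int = 5,
                 summarize: Callable[[str], str] = truncate_note) -> None:
        # A deque with maxlen silently drops the oldest note when full.
        self.notes: Deque[str] = deque(maxlen=max_notes)
        self.summarize = summarize

    def record(self, scene_text: str, action: str) -> None:
        """Store a compressed note of the scene just played and the action taken."""
        self.notes.append(f"{self.summarize(scene_text)} -> chose: {action}")

    def build_prompt(self, current_scene: str, budget_chars: int = 2000) -> str:
        """Prepend the most recent notes that fit alongside the current scene.

        budget_chars counts the scene text plus the notes; the two short
        section headers are not counted.
        """
        remaining = budget_chars - len(current_scene)
        kept: List[str] = []
        for note in reversed(self.notes):  # consider the newest notes first
            cost = len(note) + 1  # +1 for the joining newline
            if cost > remaining:
                break
            kept.append(note)
            remaining -= cost
        kept.reverse()  # present notes in chronological order
        if not kept:
            return current_scene
        return ("Earlier events:\n" + "\n".join(kept)
                + "\n\nCurrent scene:\n" + current_scene)
```

In use, the agent loop would call `record` after each choice and `build_prompt` before each LM query. Replacing `truncate_note` with an LM-written summary, or selecting notes by relevance instead of recency, are natural variations that this sketch does not evaluate.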
References
Due to limitations on context window length, our current prompting scheme only shows the LM the current scene text and does not provide a way for models to retain a memory of previous events in the game. We expect this to be important for longer-term planning and coherence, and leave this for future work to address.
— Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark
(2304.03279 - Pan et al., 2023) in Appendix: Prompts for Language Model Agents