Procedural Adherence and Interpretability Through Neuro-Symbolic Generative Agents (2402.16905v2)
Abstract: The surge in popularity of LLMs has opened the door to new approaches for creating interactive agents. However, managing and interpreting the temporal behavior of such agents over the course of a potentially infinite interaction remains challenging. The stateful, long-horizon reasoning required for coherent agent behavior does not fit well into the LLM paradigm. We propose a combination of formal logic-based program synthesis and LLM content generation to bring guarantees of procedural adherence and interpretability to generative agent behavior. To illustrate these benefits, we use Temporal Stream Logic (TSL) to generate an automaton that enforces an interpretable, high-level temporal structure on an agent. Because the automaton tracks the context of the interaction and decides how to steer the conversation, content generation can be driven so that the LLM only needs to attend to a shorter context window. We evaluated our approach on several tasks involved in creating an interactive agent specialized for generating choose-your-own-adventure games. We found that across all tasks, an automaton-enhanced agent with procedural guarantees achieves at least 96% adherence to its temporal constraints, whereas a purely LLM-based agent demonstrates as low as 14.67% adherence.
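To make the architecture in the abstract concrete, below is a minimal, hypothetical sketch (not the paper's implementation) of an automaton-gated agent in Python. A hand-written Mealy-style automaton enforces a simple temporal rule for a choose-your-own-adventure game (every scene is followed by a choice, and a "quit" input at a choice point ends the game), while the LLM, abstracted here as a text-in/text-out callable so the sketch runs without any API, is only asked to generate content for the transition the automaton selects. All state names, guards, and prompt templates are illustrative assumptions.

```python
"""Illustrative sketch: an automaton tracks long-horizon game state,
so each LLM call only sees the short context of one transition."""
from dataclasses import dataclass
from typing import Callable

# Abstract the LLM as any text-in/text-out callable (no API key needed).
LLM = Callable[[str], str]

@dataclass(frozen=True)
class Transition:
    source: str           # automaton state before the user input
    guard: str            # predicate label on the abstracted user input
    target: str           # automaton state after the transition
    prompt_template: str  # content request handed to the LLM

# Hypothetical automaton realizing a TSL-style temporal constraint:
# every scene is followed by a choice, and the game eventually ends.
TRANSITIONS = [
    Transition("scene",  "any",      "choice", "Offer two choices that follow from: {context}"),
    Transition("choice", "continue", "scene",  "Narrate the next scene after the player picks: {context}"),
    Transition("choice", "finish",   "ending", "Write a short ending for: {context}"),
]

def classify(user_input: str) -> str:
    """Stand-in input abstraction: map raw text to a guard label.
    A real system would use a richer predicate over the input."""
    return "finish" if "quit" in user_input.lower() else "continue"

def step(state: str, user_input: str, llm: LLM) -> tuple[str, str]:
    """One automaton step: pick the enabled transition, then let the
    LLM fill in content only for that transition's prompt."""
    label = classify(user_input)
    for t in TRANSITIONS:
        if t.source == state and t.guard in ("any", label):
            return t.target, llm(t.prompt_template.format(context=user_input))
    raise ValueError(f"no transition from {state!r} on input label {label!r}")

if __name__ == "__main__":
    echo_llm: LLM = lambda prompt: f"[LLM output for: {prompt}]"  # stub LLM
    state = "scene"
    for user_input in ["look around", "go left", "open the door", "quit"]:
        state, text = step(state, user_input, echo_llm)
        print(f"{state:7s} :: {text}")
```

The design point this sketch illustrates, following the paper's framing: the automaton, not the LLM, carries the stateful long-horizon structure, so temporal adherence is guaranteed by construction and each generation request stays local to a single, interpretable transition.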