Procedural Adherence and Interpretability Through Neuro-Symbolic Generative Agents (2402.16905v2)

Published 24 Feb 2024 in cs.AI, cs.LG, and cs.LO

Abstract: The surge in popularity of LLMs has opened doors for new approaches to the creation of interactive agents. However, managing and interpreting the temporal behavior of such agents over the course of a potentially infinite interaction remain challenging. The stateful, long-horizon reasoning required for coherent agent behavior does not fit well into the LLM paradigm. We propose a combination of formal logic-based program synthesis and LLM content generation to bring guarantees of procedural adherence and interpretability to generative agent behavior. To illustrate the benefits of procedural adherence and interpretability, we use Temporal Stream Logic (TSL) to generate an automaton that enforces an interpretable, high-level temporal structure on an agent. With the automaton tracking the context of the interaction and making decisions that guide the conversation accordingly, we can drive content generation in a way that lets the LLM focus on a shorter context window. We evaluated our approach on the tasks involved in creating an interactive agent specialized for generating choose-your-own-adventure games. Across all tasks, an automaton-enhanced agent with procedural guarantees achieves at least 96% adherence to its temporal constraints, whereas a purely LLM-based agent's adherence drops as low as 14.67%.
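
The automaton-driven loop the abstract describes can be pictured with a short sketch. The Python below is a minimal illustration, not the paper's implementation: a hand-written Mealy-style transition table stands in for the automaton that the paper synthesizes from a TSL specification, and call_llm, the state names, and the event names are all hypothetical placeholders.

```python
# A minimal sketch, not the paper's implementation: a hand-written
# Mealy-style transition table stands in for the automaton that the
# paper synthesizes from a TSL specification, and call_llm is a
# hypothetical placeholder for any LLM completion API.
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    next_state: str   # state the automaton moves to
    directive: str    # high-level instruction handed to the LLM


# Hypothetical temporal structure for a choose-your-own-adventure agent:
# every scene must be followed by a set of choices before the story
# advances, and the story may only end from a scene.
AUTOMATON: dict[tuple[str, str], Transition] = {
    ("intro",   "start"):  Transition("scene",   "Narrate the opening scene."),
    ("scene",   "choose"): Transition("choices", "Offer exactly three choices."),
    ("choices", "pick"):   Transition("scene",   "Continue from the chosen branch."),
    ("scene",   "finish"): Transition("end",     "Write a conclusive ending."),
}


def call_llm(directive: str, recent_turns: list[str]) -> str:
    """Placeholder for an LLM call; only a short, local context is sent."""
    raise NotImplementedError


def step(state: str, event: str, recent_turns: list[str]) -> tuple[str, str]:
    """Advance the automaton one step and generate the next utterance.

    Events with no transition from the current state are rejected
    outright, which is what yields the procedural-adherence guarantee:
    the LLM is never asked to produce content the spec forbids.
    """
    key = (state, event)
    if key not in AUTOMATON:
        raise ValueError(f"event {event!r} violates the spec in state {state!r}")
    t = AUTOMATON[key]
    return t.next_state, call_llm(t.directive, recent_turns)
```

Because any event outside the transition table is rejected, the high-level temporal structure holds by construction, and each LLM call receives only a directive plus a short window of recent turns rather than the entire interaction history.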
