End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations (2403.12451v4)

Published 19 Mar 2024 in cs.AI

Abstract: Neuro-symbolic reinforcement learning (NS-RL) has emerged as a promising paradigm for explainable decision-making, characterized by the interpretability of symbolic policies. NS-RL entails structured state representations for tasks with visual observations, but previous methods are too inefficient to refine the structured states with rewards. Accessibility also remains an issue, as extensive domain knowledge is required to interpret symbolic policies. In this paper, we present a neuro-symbolic framework for jointly learning structured states and symbolic policies, whose key idea is to distill a vision foundation model into an efficient perception module and refine it during policy learning. Moreover, we design a pipeline that prompts GPT-4 to generate textual explanations for the learned policies and decisions, significantly reducing the cognitive load required to understand the symbolic policies. We verify the efficacy of our approach on nine Atari tasks and present GPT-generated explanations for policies and decisions.
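The abstract describes a two-part recipe: distill a vision foundation model into a lightweight perception module that predicts structured (object-centric) states, then keep refining that module with reward gradients while a symbolic policy is learned on top of it. Below is a minimal PyTorch-style sketch of what such a joint objective could look like; every name here (`PerceptionModule`, `joint_loss`, `policy_loss_fn`, the architecture, and the loss weighting) is an illustrative assumption, not the authors' implementation, and the GPT-4 explanation stage is a separate prompting pipeline not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptionModule(nn.Module):
    """Lightweight CNN distilled from a vision foundation model.

    Predicts an object-centric structured state (e.g. per-object
    positions/velocities) from a raw visual observation.
    The architecture is illustrative, not the paper's.
    """
    def __init__(self, num_objects: int = 8, state_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, num_objects * state_dim)
        self.num_objects, self.state_dim = num_objects, state_dim

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, 3, H, W) -> structured state: (batch, num_objects, state_dim)
        return self.head(self.encoder(obs)).view(-1, self.num_objects, self.state_dim)


def joint_loss(perception, policy_loss_fn, obs, teacher_states, alpha=0.5):
    """Combined objective: distillation + policy learning (hypothetical).

    teacher_states are pseudo-labels produced offline by the vision
    foundation model; policy_loss_fn is any differentiable policy-gradient
    loss (e.g. PPO) evaluated on the predicted structured states. Because
    the symbolic policy consumes the perception module's output, reward
    gradients flow back into (i.e. refine) the perception module in the
    same backward pass.
    """
    states = perception(obs)
    distill = F.mse_loss(states, teacher_states)     # stay close to the teacher
    return alpha * distill + policy_loss_fn(states)  # refine with reward signal
```

In a real training loop this loss would be backpropagated at every policy update, so the distilled perception module keeps improving alongside the symbolic policy, which is the refinement the abstract says prior NS-RL methods could not do efficiently.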

Authors (6)
  1. Lirui Luo
  2. Guoxi Zhang
  3. Hongming Xu
  4. Yaodong Yang
  5. Cong Fang
  6. Qing Li
Citations (7)
