ReGAL: Refactoring Programs to Discover Generalizable Abstractions
Abstract: While LLMs are increasingly being used for program synthesis, they lack the global view needed to develop useful abstractions; they generally predict programs one at a time, often repeating the same functionality. Generating redundant code from scratch is both inefficient and error-prone. To address this, we propose Refactoring for Generalizable Abstraction Learning (ReGAL), a gradient-free method for learning a library of reusable functions via code refactorization, i.e., restructuring code without changing its execution output. ReGAL learns from a small set of existing programs, iteratively verifying and refining its abstractions via execution. We find that the shared function libraries discovered by ReGAL make programs easier to predict across diverse domains. On five datasets -- LOGO graphics generation, Date reasoning, TextCraft (a Minecraft-based text game), MATH, and TabMWP -- both open-source and proprietary LLMs improve in accuracy when predicting programs with ReGAL functions. For CodeLlama-13B, ReGAL results in absolute accuracy increases of 11.5% on LOGO, 26.1% on date understanding, and 8.1% on TextCraft, outperforming GPT-3.5 in two of three domains. Our analysis reveals ReGAL's abstractions encapsulate frequently-used subroutines as well as environment dynamics.
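To make the idea of refactoring "without changing its execution output" concrete, here is a minimal toy sketch in Python. It is not the ReGAL implementation: the turtle-like interpreter `run`, the helper `draw_polygon`, and the `verify` check are all hypothetical names chosen for illustration. The sketch assumes a candidate abstraction is kept only if every refactored program reproduces the trace of its original.

```python
import math

# Toy illustration only; every name below is hypothetical, not from ReGAL's code.

def run(program):
    """Execute a program on a fresh turtle state and return the visited points."""
    state = {"x": 0.0, "y": 0.0, "heading": 0.0, "trace": []}

    def forward(dist):
        rad = math.radians(state["heading"])
        state["x"] += dist * math.cos(rad)
        state["y"] += dist * math.sin(rad)
        state["trace"].append((round(state["x"], 6), round(state["y"], 6)))

    def left(angle):
        state["heading"] = (state["heading"] + angle) % 360

    program(forward, left)
    return state["trace"]

# Two original programs that repeat the same looping pattern from scratch.
def square_original(forward, left):
    for _ in range(4):
        forward(2)
        left(90)

def triangle_original(forward, left):
    for _ in range(3):
        forward(2)
        left(120)

# Candidate shared abstraction extracted from the programs above.
def draw_polygon(forward, left, sides, length):
    for _ in range(sides):
        forward(length)
        left(360 / sides)

# Refactored versions that call the candidate abstraction.
def square_refactored(forward, left):
    draw_polygon(forward, left, sides=4, length=2)

def triangle_refactored(forward, left):
    draw_polygon(forward, left, sides=3, length=2)

def verify(original, refactored):
    """A refactoring is accepted only if the execution output is unchanged."""
    return run(original) == run(refactored)

# Keep the abstraction in the library only if all refactorings pass verification.
library = {}
pairs = [
    (square_original, square_refactored),
    (triangle_original, triangle_refactored),
]
if all(verify(orig, ref) for orig, ref in pairs):
    library["draw_polygon"] = draw_polygon

# A deliberately wrong refactoring, to show verification rejecting it.
def square_bad_refactor(forward, left):
    draw_polygon(forward, left, sides=5, length=2)
```

In this sketch, `verify(square_original, square_bad_refactor)` returns `False` because the traces differ, so a faulty abstraction would not enter the library. The execution check is what allows abstractions to be refined without gradient updates.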