
LILO: Learning Interpretable Libraries by Compressing and Documenting Code (2310.19791v4)

Published 30 Oct 2023 in cs.CL, cs.AI, cs.LG, and cs.PL

Abstract: While LLMs now excel at code generation, a key aspect of software development is the art of refactoring: consolidating code into libraries of reusable and readable programs. In this paper, we introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code to build libraries tailored to particular problem domains. LILO combines LLM-guided program synthesis with recent algorithmic advances in automated refactoring from Stitch: a symbolic compression system that efficiently identifies optimal lambda abstractions across large code corpora. To make these abstractions interpretable, we introduce an auto-documentation (AutoDoc) procedure that infers natural language names and docstrings based on contextual examples of usage. In addition to improving human readability, we find that AutoDoc boosts performance by helping LILO's synthesizer to interpret and deploy learned abstractions. We evaluate LILO on three inductive program synthesis benchmarks for string editing, scene reasoning, and graphics composition. Compared to existing neural and symbolic methods - including the state-of-the-art library learning algorithm DreamCoder - LILO solves more complex tasks and learns richer libraries that are grounded in linguistic knowledge.

Overview of LILO: Learning Interpretable Libraries by Compressing and Documenting Code

The paper presents LILO, a neurosymbolic framework that extends LLM-based code generation with the crucial practice of refactoring: synthesizing, compressing, and documenting code to build reusable, readable libraries tailored to specific problem domains. This addresses a gap in traditional program synthesis, which generates solutions to individual tasks but rarely produces interpretable abstractions that transfer across them.

Framework and Methodology

LILO consists of three interconnected modules that form an iterative loop of synthesis, compression, and documentation:

  1. Dual-System Synthesis: This module employs a dual-system strategy combining LLM-guided searches with enumerative search. The LLM is tasked with leveraging powerful, pre-trained domain-general priors, while the enumerative search focuses on discovering domain-specific expressions.
  2. Compression via Stitch: A key component of LILO is the Stitch compression system, which efficiently identifies reusable lambda abstractions across large code corpora using branch-and-bound search. This removes redundant structure from the corpus and enables efficient rewriting in terms of the learned library.
  3. Auto-Documentation (AutoDoc): AutoDoc enhances the interpretability of synthesized code by generating human-readable names and docstrings for the identified abstractions. This step not only makes the libraries more accessible to human developers but also improves the LLM's ability to utilize these abstractions effectively.
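
The three-module loop can be sketched in miniature as follows. Note that the function names, the toy corpus, and the window-based "compression" below are illustrative stand-ins only, not LILO's actual API or the Stitch algorithm:

```python
# Toy sketch of LILO's synthesize -> compress -> document loop.
# All names and logic here are illustrative assumptions, not the real system.
from collections import Counter

def synthesize(tasks):
    """Stand-in for dual-system synthesis: one s-expression per task."""
    return [f"(map (lambda (x) (upcase (strip x))) {t})" for t in tasks]

def tokenize(program):
    return program.replace("(", " ( ").replace(")", " ) ").split()

def compress(programs, window=7, min_uses=2):
    """Crude stand-in for Stitch: the most frequent balanced subexpression
    of a fixed token width that recurs across the corpus."""
    counts = Counter()
    for p in programs:
        toks = tokenize(p)
        for i in range(len(toks) - window + 1):
            span = toks[i:i + window]
            if (span[0] == "(" and span[-1] == ")"
                    and span.count("(") == span.count(")")):
                counts[" ".join(span)] += 1
    if not counts:
        return None
    best, n = counts.most_common(1)[0]
    return best if n >= min_uses else None

def autodoc(abstraction):
    """Stand-in for LLM-based AutoDoc: attach a name and docstring."""
    return {"name": "fn_upcase_strip", "body": abstraction,
            "doc": "Placeholder docstring inferred from usage examples."}

tasks = ["words_a", "words_b", "words_c"]
programs = synthesize(tasks)          # 1. dual-system synthesis
abstraction = compress(programs)      # 2. Stitch-style compression
library = [autodoc(abstraction)] if abstraction else []  # 3. AutoDoc
```

In the real system, the documented library is fed back into the next synthesis round, so each iteration searches over a richer, named vocabulary.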

Results and Evaluation

LILO was evaluated on three inductive program synthesis benchmarks: string editing, scene reasoning, and graphics composition. The results show that LILO solves more complex tasks and learns richer, linguistically grounded libraries than state-of-the-art methods such as DreamCoder. For instance, in the string editing domain, LILO abstracted the concept of vowels, significantly reducing the search space required to solve associated tasks.
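
A toy illustration (not LILO's actual DSL) of why such an abstraction helps: once a hypothetical `vowel?` primitive is in the library, the vowel check collapses from a five-way disjunction to a single call, and enumeration cost scales with expression size:

```python
# Illustrative only: compares the token size of an explicit vowel check
# against a single learned-primitive call. The s-expressions are assumed
# examples, not drawn from LILO's benchmarks.
def size(expr):
    """Token count of an s-expression, a rough proxy for enumeration cost."""
    return len(expr.replace("(", " ( ").replace(")", " ) ").split())

without_library = "(or (eq c a) (or (eq c e) (or (eq c i) (or (eq c o) (eq c u)))))"
with_library = "(vowel? c)"

print(size(without_library), size(with_library))
```

Because enumerative search grows exponentially with program depth, shrinking a recurring subexpression to one primitive compounds across every task that uses it.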

Implications and Future Work

The framework showcases the potential of integrating programming languages (PL) techniques with recent advances in LLMs. By leveraging a neurosymbolic architecture, LILO offers a promising direction for creating interpretable and reusable code libraries. As AI continues to converge with traditional programming paradigms, further developments could see LILO applied to more diverse programming languages and problem domains, potentially advancing areas such as automated code maintenance and adaptive programming environments.

Future research can explore the integration of retrieval and self-reflection techniques to further enhance LILO's capabilities, enabling it to operate in even more dynamic and complex software environments. Moreover, bridging the gap between imperative and functional programming within this framework could open new avenues for code synthesis across modern languages.

In conclusion, LILO represents a significant step toward autonomous systems that can generate, refactor, and interpret code, aligning with the long-standing goals of adaptive, scalable, and maintainable software architectures.

Authors (7)
  1. Gabriel Grand (16 papers)
  2. Lionel Wong (16 papers)
  3. Theo X. Olausson (5 papers)
  4. Muxin Liu (5 papers)
  5. Joshua B. Tenenbaum (257 papers)
  6. Jacob Andreas (116 papers)
  7. Maddy Bowers (2 papers)
Citations (16)