Lemur: Harmonizing Natural Language and Code for Language Agents
The paper "Lemur: Harmonizing Natural Language and Code for Language Agents" introduces Lemur and Lemur-Chat, openly accessible LLMs designed to excel at both natural language and coding tasks. This dual capability addresses a growing need: language agents must not only converse with humans but also act on their environments through code.
Overview
Lemur and Lemur-Chat stand out by merging the abilities to process natural language and write code, aiming to serve as foundational models for versatile language agents. Traditional models typically prioritize one domain over the other, leading to a performance divide. In contrast, Lemur models achieve a balance, performing competitively across multiple benchmarks.
Methodology
The models are built on the Llama-2-70B architecture, enhanced through continued pre-training and instruction fine-tuning. The pre-training used a specially curated, code-intensive corpus with a 10:1 code-to-text ratio, drawing on sources such as The Stack and RefinedWeb. This stage aimed to bolster coding skills while preserving natural language proficiency. Instruction fine-tuning then followed, using roughly 300K examples spanning both text and code domains to further refine the models' capabilities.
Experimental Results
The paper presents comprehensive evaluations on diverse benchmarks:
- Natural Language Processing: MMLU, BBH, and GSM8K were used to assess language understanding and reasoning. Lemur models remained competitive with Llama-2 despite the code-heavy pre-training.
- Coding: HumanEval, MBPP, and others served to test programming competencies. Here, Lemur models showed enhanced performance over other open-source models like CodeLlama.
- Agent Tasks: Evaluations covered tool usage, feedback incorporation, and environment exploration. Lemur-Chat excelled in these interactive scenarios, surpassing existing open-source models.
In tool-augmented reasoning, such as solving math problems by writing Python, Lemur-Chat substantially outperformed other open-source models, highlighting the value of harmonized language and coding capabilities.
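The tool-augmented setup can be sketched as follows. This is a hedged illustration, not Lemur's actual agent harness: `model_generate` is a hypothetical stand-in for a call to the LLM, which would normally be prompted with the problem and return a short Python program whose execution yields the answer.

```python
def model_generate(problem: str) -> str:
    # Hypothetical stand-in: a real system would prompt Lemur-Chat here
    # and receive model-written Python back.
    return "result = (17 * 24) + 135 // 5"

def solve_with_python(problem: str) -> int:
    """Execute the model-written program and read off its `result` variable."""
    code = model_generate(problem)
    namespace = {}
    exec(code, {}, namespace)  # run the generated code in a scratch namespace
    return namespace["result"]

answer = solve_with_python(
    "A crate holds 17 rows of 24 apples, plus a fifth of 135 loose apples. "
    "How many apples in total?"
)
print(answer)  # 17*24 + 135//5 = 408 + 27 = 435
```

The point of the pattern is that arithmetic is delegated to the interpreter rather than done in the model's head, which is why strong coding ability translates directly into better tool-augmented reasoning.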
Implications and Future Directions
This research underscores the potential of models proficient in both natural language and code as a foundation for sophisticated language agents. Harmonized models like Lemur can help close the performance gap between open-source and proprietary systems, facilitating the development of advanced applications in multi-agent systems and autonomous task-solving.
Looking forward, the findings advocate for further research in optimizing the synergy between natural language and programming skills. Possible directions include refining intermediate representations to enhance performance and tackling more complex tasks in partially observable environments.
Lemur's contribution represents a step forward in developing versatile and adaptive AI systems, offering valuable insight into the future of integrated language and coding models.