Textbooks Are All You Need II: phi-1.5 technical report (2309.05463v1)

Published 11 Sep 2023 in cs.CL and cs.AI

Abstract: We continue the investigation into the power of smaller Transformer-based LLMs as initiated by \textbf{TinyStories} -- a 10 million parameter model that can produce coherent English -- and the follow-up work on \textbf{phi-1}, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art. The latter work proposed to use existing LLMs to generate "textbook quality" data as a way to enhance the learning process compared to traditional web data. We follow the "Textbooks Are All You Need" approach, focusing this time on common sense reasoning in natural language, and create a new 1.3 billion parameter model named \textbf{phi-1.5}, with performance on natural language tasks comparable to models 5x larger, and surpassing most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding. More generally, \textbf{phi-1.5} exhibits many of the traits of much larger LLMs, both good -- such as the ability to "think step by step" or perform some rudimentary in-context learning -- and bad, including hallucinations and the potential for toxic and biased generations -- encouragingly though, we are seeing improvement on that front thanks to the absence of web data. We open-source \textbf{phi-1.5} to promote further research on these urgent topics.

Overview of phi-1.5

phi-1.5 is a new 1.3-billion-parameter Transformer-based LLM. The model builds on the idea that high-quality synthetic training data can yield language understanding and reasoning capabilities comparable to those of much larger models at a fraction of the computational footprint.

Performance and Benchmarks

Phi-1.5 is trained to excel at common sense reasoning and basic coding, tasks usually reserved for much larger models. Benchmarked against models with up to 13 billion parameters, it performs comparably or better, particularly on multi-step reasoning problems. Importantly, the model's reliance on synthetic data, with no web content, also appears to reduce toxic and biased generations, an issue that plagues many contemporary models.

Training Methodology

The team behind phi-1.5 designed a data-generation process involving careful selection of seed topics, iterative fine-tuning, and strategic topic expansion, suggesting that data quality can matter as much as data quantity. Remarkably, the synthetic dataset that forms the core of phi-1.5's training material is nearly ten times smaller than the datasets used to train state-of-the-art models of similar capability, pointing to efficient learning from high-quality data.
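
The paper does not release its data-generation pipeline, so the Python sketch below is only an illustration of the general pattern described above: seed topics are expanded into finer-grained prompts, and a large "teacher" model is asked to produce textbook-style passages that are collected into a training corpus. The topic list, prompt template, and call_teacher_llm stub are hypothetical placeholders, not the authors' code.

```python
# Illustrative sketch only: the actual phi-1.5 data pipeline is not public.
# Topics, prompts, and the teacher-model stub below are hypothetical.

import json

SEED_TOPICS = [
    "basic arithmetic word problems",
    "everyday physical common sense",
    "simple Python functions with explanations",
]

PROMPT_TEMPLATE = (
    "Write a short, self-contained textbook-style passage with an exercise "
    "about the following topic: {topic}"
)


def call_teacher_llm(prompt: str) -> str:
    """Placeholder for a call to a large 'teacher' model (e.g. via an API).

    A real pipeline would return generated textbook-quality text; this stub
    just echoes the prompt so the script runs without any credentials.
    """
    return f"[synthetic passage for prompt: {prompt!r}]"


def expand_topics(topics: list[str]) -> list[str]:
    """Hypothetical topic expansion: derive finer-grained variants of each seed."""
    return [f"{t} (variant {i})" for t in topics for i in range(2)]


def build_dataset(path: str, samples_per_topic: int = 2) -> None:
    """Generate passages for every seed and expanded topic, one JSON line each."""
    topics = SEED_TOPICS + expand_topics(SEED_TOPICS)
    with open(path, "w") as f:
        for topic in topics:
            for _ in range(samples_per_topic):
                passage = call_teacher_llm(PROMPT_TEMPLATE.format(topic=topic))
                f.write(json.dumps({"topic": topic, "text": passage}) + "\n")


if __name__ == "__main__":
    build_dataset("synthetic_textbook_data.jsonl")
```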

Implications of phi-1.5

Phi-1.5's open-source release marks a step towards democratizing AI research. While it still lags behind the largest LLMs, it exhibits traits once exclusive to those models, inviting broader experimentation and investigation. The model may also pave the way for more energy-efficient and globally accessible AI, challenging the assumption that ever larger, computationally intensive models are a prerequisite for advanced capabilities.
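
Because the model is open-sourced, it can be tried locally in a few lines. The sketch below assumes the checkpoint is hosted on the Hugging Face Hub under the id microsoft/phi-1_5 and uses the standard transformers API; older library versions may additionally require trust_remote_code=True when loading.

```python
# Minimal usage sketch, assuming the checkpoint id "microsoft/phi-1_5" and a
# recent `transformers` install; older versions may need trust_remote_code=True.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

# phi-1.5 is a base (non-instruction-tuned) model, so plain completion-style
# prompts tend to work better than chat-formatted ones.
prompt = "Alice has 3 apples and buys 4 more. Step by step,"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```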

Confronting AI Shortcomings

Notably, phi-1.5 does not fully avoid generating problematic content. However, it manages these risks better than similar-sized models trained solely on web data. The research team presents phi-1.5 as a testbed for methods aimed at mitigating ethical issues in AI, with a synthetic training regimen that could point toward a new direction for responsible AI development. As the field seeks models that balance environmental cost, ethical soundness, and capability, phi-1.5 stands out as a promising step toward a more balanced approach to scaling.

Authors (6)
  1. Yuanzhi Li (119 papers)
  2. Sébastien Bubeck (90 papers)
  3. Ronen Eldan (60 papers)
  4. Allie Del Giorno (4 papers)
  5. Suriya Gunasekar (34 papers)
  6. Yin Tat Lee (102 papers)
Citations (351)