Bootstrap Your Own Context Length (2412.18860v2)
Abstract: We introduce a bootstrapping approach to train long-context LLMs by exploiting their short-context capabilities only. Our method utilizes a simple agent workflow to synthesize diverse long-context instruction tuning data, thereby eliminating the necessity for manual data collection and annotation. The proposed data synthesis workflow requires only a short-context LLM, a text retriever, and a document collection, all of which are readily accessible within the open-source ecosystem. Subsequently, LLMs are fine-tuned using the synthesized data to extend their context lengths. In this manner, we effectively transfer the short-context capabilities of LLMs to long-context scenarios through a bootstrapping process. We conduct experiments with the open-source Llama-3 family of models and demonstrate that our method can successfully extend the context length up to 1M tokens, achieving superior performance across various benchmarks.
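The abstract describes a synthesis pipeline built from three components: a short-context LLM, a text retriever, and a document collection. Below is a minimal sketch of how such a bootstrapped synthesis loop could be wired together; the class and function names (`ShortContextLLM`, `Retriever`, `synthesize_example`) are hypothetical placeholders, and the specific prompting and padding strategy is an assumption rather than the paper's actual workflow.

```python
# Hypothetical sketch of a bootstrapped long-context data synthesis loop.
# Not the paper's implementation; names and prompts are illustrative only.

import random
from dataclasses import dataclass


@dataclass
class Example:
    instruction: str   # question generated from a short seed document
    long_context: str  # seed doc plus retrieved documents, shuffled together
    answer: str        # answer produced while the model saw only the seed doc


class ShortContextLLM:
    """Placeholder for any short-context instruction-following model."""
    def generate(self, prompt: str) -> str:
        raise NotImplementedError  # plug in your model of choice


class Retriever:
    """Placeholder for a dense or sparse retriever over a document collection."""
    def search(self, query: str, k: int) -> list[str]:
        raise NotImplementedError


def synthesize_example(llm: ShortContextLLM, retriever: Retriever,
                       seed_doc: str, target_tokens: int = 100_000) -> Example:
    # 1) Ask the short-context model to write an instruction about the seed doc.
    instruction = llm.generate(
        "Write a question that can only be answered using this document:\n"
        + seed_doc)
    # 2) Answer it while the model still sees only the short seed document,
    #    so the supervision comes entirely from short-context capabilities.
    answer = llm.generate(f"{seed_doc}\n\nQuestion: {instruction}\nAnswer:")
    # 3) Pad the context with retrieved related documents and shuffle, so the
    #    seed document ends up at a random position inside a long context.
    docs = [seed_doc] + retriever.search(instruction, k=50)
    random.shuffle(docs)
    long_context = "\n\n".join(docs)[: target_tokens * 4]  # rough char budget
    return Example(instruction, long_context, answer)
```

Pairs of (instruction, long_context, answer) produced this way could then serve as long-context instruction tuning data for fine-tuning the same model family at extended context lengths.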