
Bootstrap Your Own Context Length (2412.18860v2)

Published 25 Dec 2024 in cs.CL and cs.IR

Abstract: We introduce a bootstrapping approach that trains long-context LLMs using only their short-context capabilities. Our method uses a simple agent workflow to synthesize diverse long-context instruction-tuning data, eliminating the need for manual data collection and annotation. The synthesis workflow requires only a short-context LLM, a text retriever, and a document collection, all of which are readily available in the open-source ecosystem. LLMs are then fine-tuned on the synthesized data to extend their context lengths, effectively transferring their short-context capabilities to long-context scenarios through a bootstrapping process. In experiments with the open-source Llama-3 family of models, our method extends the context length to up to 1M tokens and achieves superior performance across various benchmarks.
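The abstract only outlines the synthesis workflow (short-context LLM + retriever + document collection → long-context instruction data), so the sketch below is one plausible instantiation, not the paper's actual method. The `short_llm.generate` and `retriever.search` interfaces, the distractor-padding step, and the whitespace token proxy are all assumptions introduced for illustration.

```python
# Hypothetical sketch of a bootstrapped long-context data-synthesis step.
# A short-context LLM writes a Q/A pair grounded in a single document; the
# surrounding long context is then assembled from retrieved and random docs.
import random

def token_count(docs):
    # Rough whitespace proxy; a real pipeline would use the model tokenizer.
    return sum(len(d.split()) for d in docs)

def synthesize_long_context_example(short_llm, retriever, corpus,
                                    target_tokens=128_000):
    """Build one long-context instruction-tuning example using only
    short-context components, in the spirit of the abstract."""
    # 1. Pick a seed document and ask the short-context LLM for an
    #    instruction/answer pair answerable from that document alone.
    seed_doc = random.choice(corpus)
    instruction = short_llm.generate(
        f"Write a question answerable from this passage:\n{seed_doc}"
    )
    answer = short_llm.generate(
        f"Passage:\n{seed_doc}\n\nQuestion: {instruction}\nAnswer:"
    )

    # 2. Retrieve related documents, then pad with random distractors
    #    until the concatenated context reaches the target length.
    context_docs = [seed_doc] + retriever.search(seed_doc, k=50)
    while token_count(context_docs) < target_tokens:
        context_docs.append(random.choice(corpus))
    random.shuffle(context_docs)  # evidence should not sit at a fixed spot

    # 3. The synthesized example: long context + instruction -> short answer.
    return {
        "context": "\n\n".join(context_docs),
        "instruction": instruction,
        "answer": answer,
    }
```

Fine-tuning a short-context model on examples of this shape is what would extend its usable context window; under this assumed setup, no human annotation enters the loop at any point.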
