Weaver: Foundation Models for Creative Writing (2401.17268v1)

Published 30 Jan 2024 in cs.CL, cs.AI, and cs.LG

Abstract: This work introduces Weaver, our first family of LLMs dedicated to content creation. Weaver is pre-trained on a carefully selected corpus that focuses on improving the writing capabilities of LLMs. We then fine-tune Weaver for creative and professional writing purposes and align it to the preference of professional writers using a suite of novel methods for instruction data synthesis and LLM alignment, making it able to produce more human-like texts and follow more diverse instructions for content creation. The Weaver family consists of Weaver Mini (1.8B), Weaver Base (6B), Weaver Pro (14B), and Weaver Ultra (34B) models, suitable for different applications; they can be dynamically dispatched by a routing agent according to query complexity to balance response quality and computation cost. Evaluation on a carefully curated benchmark for assessing the writing capabilities of LLMs shows Weaver models of all sizes outperform generalist LLMs several times larger than them. Notably, our most-capable Weaver Ultra model surpasses GPT-4, a state-of-the-art generalist LLM, on various writing scenarios, demonstrating the advantage of training specialized LLMs for writing purposes. Moreover, Weaver natively supports retrieval-augmented generation (RAG) and function calling (tool usage). We present various use cases of these abilities for improving AI-assisted writing systems, including integration of external knowledge bases, tools, or APIs, and providing personalized writing assistance. Furthermore, we discuss and summarize guidelines and best practices for pre-training and fine-tuning domain-specific LLMs.

Citations (11)

Summary

  • The paper presents the Weaver models, engineered for creative and professional writing using a curated corpus and novel instruction backtranslation.
  • It details a family of models ranging from 1.8B to 34B parameters, pre-trained on a corpus filtered to remove low-quality web content in favor of high-quality writing.
  • Performance benchmarks on WriteBench show that the Weaver Ultra model outperforms larger models like GPT-4, underscoring its domain specialization.

Introduction

In the domain of LLMs, Weaver is a new family of models explicitly architected for creative and professional writing. These models are differentiated by their training regimen: a meticulously curated corpus focused on high-quality creative content such as novels and professional texts. They also incorporate a suite of novel techniques for instruction data synthesis and model alignment, yielding more human-like text generation and the ability to follow a broader range of content-creation instructions.

Model Development and Data Curation

Weaver's development pathway involves a family of models varying in size: Mini (1.8B), Base (6B), Pro (14B), and Ultra (34B). This range covers use cases from lightweight assistance to compute-intensive, high-quality generation, and a routing agent can dispatch each query to an appropriately sized model according to its complexity. A notable aspect of the pre-training regimen is the filtering of low-quality web content in favor of human-curated books, stories, articles, and other creative writing. The fine-tuning stage further benefits from an instruction backtranslation procedure that derives high-quality instructions from existing professional texts, reducing annotation cost while raising the quality of the instruction data.
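
To make the backtranslation idea concrete, the sketch below shows one way such instruction-response pairs could be synthesized: an existing LLM is asked to infer the instruction that a high-quality, human-written passage would answer, and the passage itself becomes the target output. The prompt wording, the backtranslate_instruction helper, and the use of the OpenAI client are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch of instruction backtranslation (illustrative, not the
# paper's pipeline): infer the instruction behind an existing passage.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def backtranslate_instruction(passage: str, model: str = "gpt-4o-mini") -> str:
    """Ask an LLM to reconstruct a plausible writing instruction for a passage."""
    prompt = (
        "Below is a piece of professional writing. Write the single instruction "
        "a user could have given to produce it. Return only the instruction.\n\n"
        f"Text:\n{passage}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,
    )
    return response.choices[0].message.content.strip()

def build_training_pair(passage: str) -> dict:
    """Pair the inferred instruction with the original passage as the target."""
    return {"instruction": backtranslate_instruction(passage), "output": passage}
```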

Performance Benchmarks

Weaver's capabilities are evaluated on a custom-built benchmark called WriteBench, crafted to assess the creative writing abilities of LLMs. Size-for-size, the Weaver models outperform generalist counterparts that are several times larger. In particular, the Weaver Ultra model surpasses GPT-4 on multiple dimensions of the writing domain, a notable result for LLMs specialized for writing.

Applications and Innovations

Beyond model training, the technology culminates in the WawaWriter platform, a human-AI collaborative writing environment. Users interact with the models through co-editing, integration of personal knowledge bases, personalized writing assistance, and even effectively unbounded ("infinite") long-text generation through advanced prompt-engineering techniques, building on Weaver's native support for retrieval-augmented generation (RAG) and function calling. WawaWriter aims to redefine and elevate the AI-assisted writing paradigm, combining the specialized models with user-centric application design; a sketch of the personal knowledge base idea follows below.
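
As an illustration of the personal knowledge base idea, the sketch below retrieves a user's own notes and folds them into the writing prompt before generation. The TF-IDF retriever and the generate_with_weaver placeholder are assumptions made for illustration; the paper does not specify WawaWriter's retrieval stack.

```python
# Minimal sketch of retrieval-augmented writing assistance over a personal
# knowledge base. The retriever and model call are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "Chapter outline: the protagonist returns to the coastal town of Maren.",
    "Style note: short declarative sentences, present tense, first person.",
    "Research: 19th-century lighthouse keepers logged the weather twice daily.",
]

vectorizer = TfidfVectorizer()
kb_vectors = vectorizer.fit_transform(knowledge_base)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k knowledge-base entries most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), kb_vectors)[0]
    return [knowledge_base[i] for i in scores.argsort()[::-1][:k]]

def generate_with_weaver(prompt: str) -> str:
    """Placeholder for a call to a Weaver model endpoint (hypothetical)."""
    return f"[draft generated from a {len(prompt)}-character prompt]"

def assisted_draft(request: str) -> str:
    """Ground the user's writing request in their own notes, then generate."""
    context = "\n".join(retrieve(request))
    prompt = f"Background notes:\n{context}\n\nWriting request: {request}"
    return generate_with_weaver(prompt)

print(assisted_draft("Draft the opening paragraph of the lighthouse chapter."))
```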

In summary, the Weaver family demonstrates the potential of LLMs that are deliberately trained for a specific domain, and its results lay groundwork for future work on domain-specific generative AI.
