- The paper introduces TOWER, an open multilingual LLM tailored to translation-related tasks through continued pretraining and instruction finetuning.
- It leverages 20B tokens of monolingual and parallel data across 10 languages, plus the TOWER BLOCKS dataset, to boost translation, quality estimation, automatic post-editing, and grammatical error correction.
- Benchmark results show TOWER INSTRUCT outperforms open alternatives and rivals closed models like GPT-4 across key evaluation metrics.
Introduction
In the continuously evolving landscape of multilingual NLP, demand remains high for systems that proficiently handle a variety of translation-related tasks -- such as quality estimation, automatic post-editing, and grammatical error correction. Recent advances have seen general-purpose LLMs set new benchmarks across these tasks, yet open LLMs still lag behind, particularly across the range of tasks that make up real translation workflows. "TOWER: An Open Multilingual LLM for Translation-Related Tasks" addresses this gap with a tailored LLM that not only stands competitive against closed-source giants but also sets a new standard for open multilingual models across a spectrum of translation-related tasks.
TOWER is built from three components:
- TOWER BASE, which extends the multilingual capabilities of LLaMA-2 through continued pretraining on a mixture of monolingual and parallel data, covering 20B tokens across 10 languages.
- TOWER BLOCKS, a curated dataset of instruction-formatted records for finetuning LLMs on translation-related tasks.
- TOWER INSTRUCT, the final model obtained by finetuning TOWER BASE on TOWER BLOCKS, designed to understand and execute translation-related instructions (a usage sketch follows this list).
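To make the pipeline concrete, here is a minimal sketch of querying TOWER INSTRUCT for zero-shot translation with Hugging Face transformers. The checkpoint name `Unbabel/TowerInstruct-7B-v0.1` and the prompt wording are assumptions based on the public release, not taken from the paper:

```python
# pip install transformers torch accelerate
import torch
from transformers import pipeline

# Checkpoint name assumed from the public release.
pipe = pipeline(
    "text-generation",
    model="Unbabel/TowerInstruct-7B-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Zero-shot translation, phrased as an instruction.
messages = [{
    "role": "user",
    "content": "Translate the following text from Portuguese into English.\n"
               "Portuguese: Um grupo de investigadores lançou um novo modelo.\n"
               "English:",
}]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
out = pipe(prompt, max_new_tokens=128, do_sample=False, return_full_text=False)
print(out[0]["generated_text"])
```

Greedy decoding (`do_sample=False`) mirrors the deterministic setup typically used when reporting translation benchmarks.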
Extensive benchmarking shows that TOWER INSTRUCT consistently outperforms open alternatives and is competitive with leading closed-source models such as GPT-4 and GPT-3.5 Turbo across metrics including COMET-22, BLEURT, and chrF. Notably, it excels in both translation directions (into and out of English) for the languages in its training corpus, highlighting its refined multilingual capabilities.
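For a sense of how such metric numbers are produced, the sketch below scores a toy hypothesis with COMET-22 and chrF via the `unbabel-comet` and `sacrebleu` packages (BLEURT works analogously); the example data is invented, not drawn from the paper's benchmarks:

```python
# pip install unbabel-comet sacrebleu
from comet import download_model, load_from_checkpoint
from sacrebleu.metrics import CHRF

srcs = ["Um grupo de investigadores lançou um novo modelo."]
hyps = ["A group of researchers released a new model."]      # system output
refs = ["A group of researchers has launched a new model."]  # human reference

# COMET-22: a learned, reference-based metric.
comet = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
data = [{"src": s, "mt": h, "ref": r} for s, h, r in zip(srcs, hyps, refs)]
print("COMET-22:", comet.predict(data, batch_size=8, gpus=0).system_score)

# chrF: a surface-level, character n-gram metric.
print("chrF:", CHRF().corpus_score(hyps, [refs]).score)
```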
TOWER is evaluated against a wide array of translation tasks and related activities, including automatic post-editing (APE) and named entity recognition (NER), where it shows notable proficiency. In APE, it corrects oscillatory hallucinations (degenerate repetition loops) in translated text, yielding significant quality gains. Its performance on NER across multiple languages likewise reflects strong instruction following, a testament to the diversity and quality of TOWER BLOCKS.
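As an illustration, an APE request might be phrased as the instruction below; the template wording is hypothetical, written in the general instruction style of TOWER BLOCKS rather than copied from it:

```python
# A hypothetical APE prompt; the exact TOWER BLOCKS template may differ.
src = "The committee approved the proposal yesterday."
mt = "Der Ausschuss genehmigte den Vorschlag gestern gestern gestern."  # repetition loop
messages = [{
    "role": "user",
    "content": (
        f"Source (English): {src}\n"
        f"Translation (German): {mt}\n"
        "Correct the translation so it is fluent and faithful to the source. "
        "Reply with the corrected German text only."
    ),
}]
# Feed `messages` through the same chat-templated pipeline as above.
```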
In grammatical error correction (GEC), TOWER delivers promising results but leaves room for improvement, suggesting an avenue for future versions of the model.
The Importance of Parallel Data
A pivotal element of TOWER's development is the integration of parallel data during continued pretraining, which significantly bolsters translation quality. This underscores the value of injecting cross-lingual signal early in model development: the practice is highly sample-efficient and continues to yield translation quality improvements as data volume grows.
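A minimal sketch of such a monolingual/parallel mixture follows; the pair-serialization template and the one-third parallel share are illustrative assumptions, not the paper's exact recipe:

```python
import random

def format_pair(src_lang, tgt_lang, src, tgt):
    # One plausible serialization of a sentence pair into a single
    # pretraining sequence; the paper's exact template may differ.
    return f"{src_lang}: {src}\n{tgt_lang}: {tgt}"

def pretraining_stream(mono_docs, parallel_pairs, parallel_ratio=1/3, seed=0):
    # Interleave monolingual documents with serialized translation pairs.
    # The 1/3 parallel share is an assumption for illustration only.
    rng = random.Random(seed)
    mono, pairs = iter(mono_docs), iter(parallel_pairs)
    while True:
        try:
            if rng.random() < parallel_ratio:
                yield format_pair(*next(pairs))
            else:
                yield next(mono)
        except StopIteration:
            return

# Toy usage:
stream = pretraining_stream(
    ["Researchers released a new model today."],
    [("English", "Portuguese", "Hello, world.", "Olá, mundo.")],
)
print(next(stream))
```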
Conclusion and Future Directions
TOWER marks a significant stride toward making open LLMs useful and accessible for multilingual translation tasks. By addressing the full translation workflow through a structured training and evaluation pipeline, TOWER offers a robust foundation for future work on translation quality and related processes.
Released alongside the paper are the TOWER model family, TOWER BLOCKS, and the TOWER EVAL evaluation suite, resources that ensure reproducibility and encourage further research. Such contributions matter for the broader NLP community, fostering advances in multilingual processing and translation workflows.
Looking ahead, the paper points to handling longer contexts and exploring interrelationships among translation tasks as promising directions. With its open model and expansive dataset, TOWER not only raises the bar for translation-related tasks but also pushes forward the development of versatile, multilingual LLMs.