Leveraging Pivot Language in Cross-Lingual Instruction Tuning
The paper "Leveraging Pivot Language in Cross-Lingual Instruction Tuning" addresses the challenges associated with instruction tuning in LLMs for low-resource languages. This research is situated within the context of the uneven language distribution in pre-training data, which hinders LLM performance across languages. The authors propose an approach named Pivot Language Guided Generation (PLUG), which utilizes a high-resource language as a pivot to enhance instruction-tuning capabilities in target languages with fewer resources.
Core Concepts
Instruction tuning has proven vital for improving LLMs' ability to understand and follow diverse human instructions, but its benefits have mainly accrued to high-resource languages such as English. Extending these capabilities to lower-resource languages is harder because the models' foundational proficiency in those languages is weaker. PLUG addresses this with a two-step response generation process: the model first works through the instruction in a high-resource language, where it is strongest, and then renders the answer in the target language.
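To make the two-step idea concrete, here is a minimal sketch of how the target-language portion of such a generation might be recovered at inference time. The delimiter tag and function name are illustrative assumptions, not the paper's implementation:

```python
def extract_target_response(generation: str,
                            target_tag: str = "### Target-language response:") -> str:
    """Return only the target-language part of a two-step, PLUG-style generation.

    Assumes the fine-tuned model emits its pivot-language response first,
    followed by a delimiter and the target-language answer; the exact tag
    format is a placeholder, not taken from the paper.
    """
    _, sep, tail = generation.partition(target_tag)
    return tail.strip() if sep else generation.strip()
```

In practice, the delimiter would need to match whatever format the bilingual training data used.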
Methodology
The PLUG method trains LLMs on a bilingual data representation (a minimal sketch follows the list):
- Pivot Step: Process instructions in a high-resource pivot language (e.g., English).
- Target Step: Generate responses in the target language.
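The sketch below illustrates one way such bilingual training examples might be assembled, assuming the supervision target simply concatenates the pivot-language response with the target-language response. The field names, tags, and prompt template are illustrative assumptions rather than the paper's released code:

```python
from dataclasses import dataclass

PIVOT_TAG = "### English response:"           # illustrative delimiter
TARGET_TAG = "### Target-language response:"  # illustrative delimiter

@dataclass
class BilingualExample:
    instruction_tgt: str   # instruction in the target language (e.g., Korean)
    response_pivot: str    # reference response in the pivot language (e.g., English)
    response_tgt: str      # reference response in the target language

def build_training_pair(ex: BilingualExample) -> dict:
    """Build one supervised fine-tuning pair: the model reads the
    target-language instruction and learns to emit the pivot-language
    response first, then the target-language response."""
    prompt = f"### Instruction:\n{ex.instruction_tgt}\n\n### Response:\n"
    completion = (
        f"{PIVOT_TAG}\n{ex.response_pivot}\n\n"
        f"{TARGET_TAG}\n{ex.response_tgt}"
    )
    return {"prompt": prompt, "completion": completion}
```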
To assess this approach, the authors introduce X-AlpacaEval, a benchmark of professionally translated instructions in four languages: Chinese, Korean, Italian, and Spanish. It enables quantitative evaluation of instruction-following ability in each target language.
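As a rough illustration of how instruction-following might be scored on such a benchmark, the following skeleton computes a pairwise win rate between two systems. The three callables are user-supplied placeholders (the judge could be backed by a strong LLM, for example); this is not the paper's evaluation pipeline:

```python
from typing import Callable, Iterable

def win_rate(instructions: Iterable[str],
             candidate: Callable[[str], str],
             baseline: Callable[[str], str],
             judge: Callable[[str, str, str], bool]) -> float:
    """Fraction of instructions on which the judge prefers the candidate's
    answer over the baseline's. `judge(instruction, answer_a, answer_b)`
    returns True if answer_a is preferred; all three callables are
    placeholders for this sketch."""
    wins, total = 0, 0
    for instruction in instructions:
        wins += judge(instruction, candidate(instruction), baseline(instruction))
        total += 1
    return wins / total if total else 0.0
```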
Key Results
The paper presents significant performance gains:
- An average improvement of 29% in instruction-following ability in the target languages when English is used as the pivot.
- Continued gains with alternative pivot languages, indicating the approach is flexible and not tied to English.
These results underscore PLUG's effectiveness at leveraging an LLM's existing strengths to support instruction tuning in languages with limited resources. The experiments confirm the approach is robust for both multilingual models (such as PolyLM) and English-centric models (such as LLaMA-2).
Practical and Theoretical Implications
Practically, PLUG offers a way to adapt AI assistants to multilingual contexts, particularly for languages that lack substantial training data. Theoretically, it offers insight into how relationships between languages can be exploited during model training, and it opens avenues for further research into dynamic language pivoting based on context or task requirements.
Future Developments
The paper hints at potential expansions, such as:
- Exploring intrinsic attributes of languages, beyond resource availability alone, that determine how suitable a language is as a pivot.
- Integrating PLUG with emerging multilingual LLMs tuned on diverse datasets.
Such explorations could refine global AI assistants and improve their accessibility and effectiveness across culturally and linguistically diverse environments.
Conclusion
The introduction of PLUG provides a practical framework for overcoming language disparities in instruction tuning. By deliberately routing generation through a high-resource language, this research demonstrates a feasible method for improving LLM outputs in low-resource languages and broadens the possibilities for deploying AI in a multilingual global landscape.