
CTQScorer: Combining Multiple Features for In-context Example Selection for Machine Translation (2305.14105v2)

Published 23 May 2023 in cs.CL and cs.AI

Abstract: LLMs have demonstrated the capability to perform machine translation when prompted with a few examples (in-context learning). Translation quality depends on various features of the selected examples, such as their quality and relevance, but previous work has predominantly focused on individual features in isolation. In this paper, we propose a general framework for combining different features influencing example selection. We learn a regression model, CTQ Scorer (Contextual Translation Quality), that selects examples based on multiple features in order to maximize the translation quality. On multiple language pairs and LLMs, we show that CTQ Scorer significantly outperforms random selection as well as strong single-factor baselines reported in the literature. We also see an improvement of over 2.5 COMET points on average with respect to a strong BM25 retrieval-based baseline.
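
The idea of scoring candidate examples with a regression model over several features can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes only two features (relevance and quality, the two named in the abstract), substitutes a plain linear regression trained by gradient descent for whatever model the paper uses, and runs on made-up toy numbers. The names `fit_linear_scorer`, `score`, and `select_top_k` are hypothetical.

```python
# Toy sketch: combine several example features into one predicted
# translation-quality score, then pick the top-k candidates as prompt examples.
# All data below is invented for illustration only.

def fit_linear_scorer(rows, targets, lr=0.5, epochs=5000):
    """Fit targets ~ w . x + b by batch gradient descent on squared error."""
    n, n_feat = len(rows), len(rows[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, y in zip(rows, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def score(w, b, x):
    """Predicted quality for one candidate's feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def select_top_k(w, b, candidates, k):
    """Return the names of the k candidates with the highest predicted quality."""
    ranked = sorted(candidates, key=lambda c: score(w, b, c[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Each row: [relevance to the input, quality of the example pair] (assumed features).
TRAIN_FEATURES = [[0.9, 0.2], [0.1, 0.8], [0.5, 0.5],
                  [0.3, 0.9], [0.8, 0.7], [0.2, 0.1]]
# Observed translation quality when each example was used (invented values).
TRAIN_QUALITY = [0.69, 0.31, 0.50, 0.48, 0.77, 0.17]

CANDIDATES = [("ex_a", [0.95, 0.1]), ("ex_b", [0.4, 0.9]),
              ("ex_c", [0.85, 0.8]), ("ex_d", [0.1, 0.2])]

w, b = fit_linear_scorer(TRAIN_FEATURES, TRAIN_QUALITY)
top = select_top_k(w, b, CANDIDATES, k=2)
print(top)
```

In this toy setup, a candidate that is strong on both features can outrank one that is strong on relevance alone, which is the kind of trade-off a single-feature selector cannot express.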

Authors (4)
  1. Aswanth Kumar (3 papers)
  2. Ratish Puduppully (20 papers)
  3. Raj Dabre (65 papers)
  4. Anoop Kunchukuttan (45 papers)
Citations (11)