News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation (2406.12634v1)

Published 18 Jun 2024 in cs.IR and cs.AI

Abstract: Rapidly growing numbers of multilingual news consumers pose an increasing challenge to news recommender systems in terms of providing customized recommendations. First, existing neural news recommenders, even when powered by multilingual language models (LMs), suffer substantial performance losses in zero-shot cross-lingual transfer (ZS-XLT). Second, the current paradigm of fine-tuning the backbone LM of a neural recommender on task-specific data is computationally expensive and infeasible in few-shot recommendation and cold-start setups, where data is scarce or completely unavailable. In this work, we propose a news-adapted sentence encoder (NaSE), domain-specialized from a pretrained massively multilingual sentence encoder (SE). To this end, we construct and leverage PolyNews and PolyNewsParallel, two multilingual news-specific corpora. With the news-adapted multilingual SE in place, we test the effectiveness of (i.e., question the need for) supervised fine-tuning for news recommendation, and propose a simple and strong baseline based on (i) frozen NaSE embeddings and (ii) late click-behavior fusion. We show that NaSE achieves state-of-the-art performance in ZS-XLT in true cold-start and few-shot news recommendation.

Authors (4)
  1. Andreea Iana (11 papers)
  2. Fabian David Schmidt (11 papers)
  3. Goran Glavaš (82 papers)
  4. Heiko Paulheim (65 papers)
Citations (2)

Summary

Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation

The paper presents a focused investigation into the challenges of recommending multilingual news articles with neural news recommenders (NNRs) built on multilingual sentence encoders (SEs). The authors highlight two main challenges: performance degradation in zero-shot cross-lingual transfer (ZS-XLT), and the computational infeasibility of fine-tuning backbone language models (LMs) in low-data settings such as few-shot recommendation and cold-start setups.

Contributions

The key contributions of the paper include the development of a news-adapted sentence encoder (NaSE) derived from a pretrained massively multilingual SE, and the construction of two multilingual news-specific corpora: PolyNews and PolyNewsParallel. The authors also propose a simple yet strong baseline for news recommendation that combines frozen NaSE embeddings with late click-behavior fusion.
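A minimal sketch of such a baseline, assuming a SentenceTransformer-compatible frozen encoder (LaBSE is used below as a stand-in for NaSE, whose checkpoint name is not given in this summary) and mean pooling as the fusion function:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Frozen multilingual sentence encoder. LaBSE stands in for NaSE here;
# the NaSE checkpoint is not named in this summary, so this is a placeholder.
encoder = SentenceTransformer("sentence-transformers/LaBSE")

def user_embedding(clicked_titles):
    """Late click-behavior fusion: encode each clicked article with the frozen
    encoder and fuse the embeddings (mean pooling here) into one user vector."""
    embs = encoder.encode(clicked_titles, normalize_embeddings=True)
    return embs.mean(axis=0)

def rank_candidates(clicked_titles, candidate_titles):
    """Score candidates by dot product with the fused user vector and return
    candidate indices ordered from most to least relevant."""
    user_vec = user_embedding(clicked_titles)
    cand_embs = encoder.encode(candidate_titles, normalize_embeddings=True)
    scores = cand_embs @ user_vec
    return list(np.argsort(-scores))

# Toy usage: rank two candidate headlines for a user with two clicks.
clicks = ["Central bank raises interest rates", "Inflation eases in the euro zone"]
candidates = ["New striker joins local football club", "Markets react to rate decision"]
print(rank_candidates(clicks, candidates))
```

Scoring against the mean of the clicked-news embeddings is equivalent, for dot-product scoring, to averaging per-click scores, which is the sense in which the fusion happens "late": no parameters are trained on click behavior.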

Methodology

News-Adapted Sentence Encoder (NaSE)

The authors initialize NaSE from a general-purpose massively multilingual SE, LaBSE, and specialize it using denoising autoencoding (DAE) and machine translation (MT) objectives on the PolyNews and PolyNewsParallel corpora. Four training strategies are explored: DAE only, MT only, joint DAE+MT, and sequential training with DAE followed by MT (NaSE-SEQ).
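Conceptually, both objectives build (input, target) pairs for the same sequence-to-sequence reconstruction loss and differ only in where the input comes from. The sketch below illustrates this with a generic multilingual seq2seq model (mT5-small) and simple word-deletion noise; both are stand-ins chosen for illustration, not the paper's exact setup, which adapts the LaBSE encoder itself.

```python
import random
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative stand-in model; the paper adapts the LaBSE encoder, not mT5.
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

def add_noise(text, delete_prob=0.3):
    """Word-deletion noise for the denoising autoencoding (DAE) objective
    (a simplification of the corruption used in practice)."""
    kept = [w for w in text.split() if random.random() > delete_prob]
    return " ".join(kept) if kept else text

def seq2seq_loss(source, target):
    """Reconstruction loss shared by both adaptation objectives."""
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    return model(**inputs, labels=labels).loss

news_text = "The parliament approved the new climate bill on Tuesday."
src_fr = "Le parlement a approuvé la nouvelle loi sur le climat mardi."

# DAE (PolyNews): reconstruct the original sentence from a corrupted version.
dae_loss = seq2seq_loss(add_noise(news_text), news_text)

# MT (PolyNewsParallel): reconstruct the target-language sentence from the source.
mt_loss = seq2seq_loss(src_fr, news_text)

# DAE+MT mixes both kinds of batches in one run; the sequential variant trains
# with DAE first and then continues with MT.
print(float(dae_loss), float(mt_loss))
```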

Training Data and Process

PolyNews consists of approximately 3.9 million multilingual news texts in 77 languages; PolyNewsParallel contains around 5.4 million news translations across 833 language pairs. The language distribution of the training data is smoothed according to resource level to ensure balanced learning across languages. The NaSE variants are trained for 50,000 steps with a learning rate of 3e-5 using the AdamW optimizer, and validation is performed on a cross-lingual news recommendation task built on the xMIND dataset, a translation of the English MIND benchmark into 14 languages.
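The language-balancing step can be pictured as drawing training examples with exponent-smoothed language probabilities, a common recipe in multilingual pretraining; the exponent and the exact smoothing scheme below are assumptions for illustration, not the paper's reported configuration.

```python
import numpy as np

def language_sampling_probs(counts, alpha=0.3):
    """Exponent-smoothed sampling: each language is drawn with probability
    proportional to (its share of the corpus) ** alpha, which upsamples
    low-resource languages relative to their raw frequency."""
    langs = list(counts)
    freqs = np.array([counts[lang] for lang in langs], dtype=float)
    freqs /= freqs.sum()
    smoothed = freqs ** alpha
    smoothed /= smoothed.sum()
    return dict(zip(langs, smoothed))

# Toy corpus statistics (not the real PolyNews counts).
counts = {"en": 1_500_000, "de": 300_000, "sw": 20_000, "ht": 2_000}
total = sum(counts.values())
for lang, p in language_sampling_probs(counts).items():
    print(f"{lang}: raw share {counts[lang] / total:.3%} -> sampling prob {p:.3%}")
```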

Evaluation

Neural News Recommenders (NNRs)

Seven diverse NNR architectures are evaluated:

  1. NAML
  2. MINS
  3. CAUM
  4. MANNeR
  5. LFRec-CE
  6. LFRec-SCL
  7. CAT (a text-agnostic baseline)

Results

The evaluation on the small variants of MIND and xMIND shows that SE-based NNRs outperform the text-agnostic baseline, and that using NaSE as the news encoder (NE) yields better performance than fine-tuned LaBSE and non-specialized multilingual LMs, especially when the NE is kept frozen.

Key numerical results include:

  • NaSE achieves an nDCG@10 of 39.01% in English and 38.23% averaged across 14 xMIND languages in frozen NE configurations, illustrating the efficacy of news-specific domain adaptation.
  • NaSE consistently shows reduced performance losses in ZS-XLT scenarios compared to LaBSE, with relative improvements in ranking metrics such as MRR and nDCG@10.
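For reference, the two ranking metrics reported above can be computed from a scored candidate list as follows (a minimal sketch of the standard binary-relevance definitions, not the paper's evaluation code).

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items of a ranked list."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float((rel / discounts).sum())

def ndcg_at_k(relevances, k=10):
    """nDCG@k: DCG of the predicted ranking normalized by the ideal DCG."""
    ideal = np.sort(relevances)[::-1]
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

def mrr(relevances):
    """Reciprocal rank of the first clicked item in a single impression."""
    hits = np.flatnonzero(np.asarray(relevances) > 0)
    return float(1.0 / (hits[0] + 1)) if hits.size else 0.0

# Toy impression: candidates in predicted order, 1 = clicked, 0 = not clicked.
ranked_labels = [0, 1, 0, 0, 1, 0]
print(ndcg_at_k(ranked_labels), mrr(ranked_labels))
```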

The detailed assessment of few-shot learning scenarios (10, 50, and 100 shots) further underscores NaSE's robustness, where it consistently outperforms LaBSE, especially in extreme low-data setups.

Implications and Future Directions

Practically, this research highlights the feasibility of using robust domain-adapted SEs without extensive and computationally expensive fine-tuning. Theoretically, it opens new avenues for leveraging pretrained multilingual models for domain-specific tasks through task-agnostic adaptation strategies.

Future research may explore expanding NaSE's language coverage and enhancing the domain adaptation process using larger and more diverse news corpora. Investigating the integration of external user behavior or contextual signals could also further improve the accuracy and relevance of multilingual news recommendations.

The findings set a precedent for next-generation multilingual news recommenders, emphasizing efficiency and cross-lingual capability critical for real-world applications where resource constraints and language diversity pose significant challenges.