
Evaluating Knowledge-based Cross-lingual Inconsistency in Large Language Models (2407.01358v1)

Published 1 Jul 2024 in cs.CL

Abstract: This paper investigates the cross-lingual inconsistencies observed in LLMs such as ChatGPT, Llama, and Baichuan, which have shown exceptional performance in various NLP tasks. Despite these successes, the models often exhibit significant inconsistencies when processing the same concepts across different languages. This study focuses on three primary questions: whether cross-lingual inconsistencies exist in LLMs, the specific aspects in which these inconsistencies manifest, and the correlation between cross-lingual consistency and the multilingual capabilities of LLMs. To address these questions, we propose an evaluation method for Cross-lingual Semantic Consistency (xSC) using the LaBSE model. We further introduce metrics for Cross-lingual Accuracy Consistency (xAC) and Cross-lingual Timeliness Consistency (xTC) to comprehensively assess semantic, accuracy, and timeliness inconsistencies. By harmonizing these metrics, we provide a holistic measurement of LLMs' cross-lingual consistency. Our findings aim to enhance the understanding and improvement of multilingual capabilities and interpretability in LLMs, contributing to the development of more robust and reliable multilingual LLMs.
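The abstract names the building blocks (LaBSE embeddings, three consistency metrics, a harmonized score) but not the computation. Below is a minimal sketch of what xSC could look like: it encodes a model's answers to the same question posed in several languages with LaBSE (via the sentence-transformers library) and averages pairwise cosine similarities. The `xsc` and `harmonized` functions, the harmonic-mean combination, and the example data are illustrative assumptions, not the paper's exact formulation.

```python
from itertools import combinations

import numpy as np
from sentence_transformers import SentenceTransformer

# LaBSE, the multilingual sentence encoder the paper uses for xSC.
model = SentenceTransformer("sentence-transformers/LaBSE")

def xsc(answers_by_lang: dict) -> float:
    """Assumed xSC form: mean pairwise cosine similarity of a model's
    answers to the same question posed in different languages."""
    texts = list(answers_by_lang.values())
    embs = model.encode(texts, normalize_embeddings=True)  # unit-norm vectors
    pairs = combinations(range(len(texts)), 2)
    return float(np.mean([np.dot(embs[i], embs[j]) for i, j in pairs]))

def harmonized(xsc_s: float, xac_s: float, xtc_s: float) -> float:
    """Hypothetical harmonic-mean combination of xSC, xAC, and xTC."""
    scores = [xsc_s, xac_s, xtc_s]
    return len(scores) / sum(1.0 / max(s, 1e-9) for s in scores)

# The same factual question answered by one model in three languages.
answers = {
    "en": "Mount Everest is the highest mountain on Earth.",
    "zh": "珠穆朗玛峰是地球上最高的山峰。",
    "es": "El monte Everest es la montaña más alta de la Tierra.",
}
print(f"xSC = {xsc(answers):.3f}")  # near 1.0 for semantically consistent answers
```

Under this reading, a score near 1.0 indicates the model conveys the same meaning across languages; xAC and xTC would additionally score answers against reference facts and time-sensitive knowledge, per the paper's definitions.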

Authors (6)
  1. Xiaolin Xing (1 paper)
  2. Zhiwei He (42 papers)
  3. Haoyu Xu (4 papers)
  4. Xing Wang (191 papers)
  5. Rui Wang (996 papers)
  6. Yu Hong (25 papers)