
Mind the Gap: Cross-Lingual Information Retrieval with Hierarchical Knowledge Enhancement (2112.13510v1)

Published 27 Dec 2021 in cs.IR and cs.LG

Abstract: Cross-Lingual Information Retrieval (CLIR) aims to rank documents written in a language different from that of the user's query. The intrinsic gap between languages is an essential challenge for CLIR. In this paper, we introduce a multilingual knowledge graph (KG) to the CLIR task because it provides rich information about entities in multiple languages. We regard it as a "silver bullet" that simultaneously performs explicit alignment between queries and documents and broadens the representations of queries. We propose a model named CLIR with hierarchical knowledge enhancement (HIKE) for this task. The proposed model encodes the textual information in queries, documents, and the KG with multilingual BERT, and incorporates the KG information into the query-document matching process through a hierarchical information fusion mechanism. In particular, HIKE first integrates the entities and their neighborhoods in the KG into query representations with a knowledge-level fusion, then combines the knowledge from both the source and target languages with a language-level fusion to further mitigate the linguistic gap. Experimental results demonstrate that HIKE achieves substantial improvements over state-of-the-art competitors.
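
The two-stage fusion described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the fusion operators, so the dot-product attention, the residual addition, and the simple averaging below are assumptions, and the function names (`knowledge_level_fusion`, `language_level_fusion`, `match_score`) are hypothetical. Plain Python lists stand in for the multilingual BERT encodings.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query_vec, candidate_vecs):
    """Attention-weighted average of candidate vectors, keyed by the query."""
    weights = softmax([dot(query_vec, c) for c in candidate_vecs])
    dim = len(query_vec)
    return [sum(w * c[i] for w, c in zip(weights, candidate_vecs))
            for i in range(dim)]


def knowledge_level_fusion(query_vec, entity_vec, neighbor_vecs):
    """Fold an entity and its KG neighbors into the query representation.

    Assumption: attention over entity + neighbors, added residually.
    """
    context = attend(query_vec, [entity_vec] + neighbor_vecs)
    return [q + c for q, c in zip(query_vec, context)]


def language_level_fusion(source_vec, target_vec):
    """Combine source- and target-language views (assumed: plain average)."""
    return [(s + t) / 2 for s, t in zip(source_vec, target_vec)]


def match_score(fused_query, doc_vec):
    """Score a document against the fused query (assumed: dot product)."""
    return dot(fused_query, doc_vec)


# Toy stand-ins for encoder outputs; real inputs would be mBERT embeddings.
query = [1.0, 0.0]
src = knowledge_level_fusion(query, [0.0, 1.0], [[0.5, 0.5]])  # source-language KG text
tgt = knowledge_level_fusion(query, [0.2, 0.8], [[0.4, 0.6]])  # target-language KG text
fused = language_level_fusion(src, tgt)
score = match_score(fused, [0.3, 0.7])
```

The ordering mirrors the abstract: knowledge is fused into the query separately for each language, and only then are the two language views combined before matching against the document.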

Authors (7)
  1. Fuwei Zhang (6 papers)
  2. Zhao Zhang (250 papers)
  3. Xiang Ao (33 papers)
  4. Dehong Gao (26 papers)
  5. Fuzhen Zhuang (97 papers)
  6. Yi Wei (60 papers)
  7. Qing He (88 papers)
Citations (14)
