Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

BCWS: Bilingual Contextual Word Similarity (1810.08951v1)

Published 21 Oct 2018 in cs.CL

Abstract: This paper introduces the first dataset for evaluating English-Chinese Bilingual Contextual Word Similarity, namely BCWS (https://github.com/MiuLab/BCWS). The dataset consists of 2,091 English-Chinese word pairs with the corresponding sentential contexts and their similarity scores annotated by the human. Our annotated dataset has higher consistency compared to other similar datasets. We establish several baselines for the bilingual embedding task to benchmark the experiments. Modeling cross-lingual sense representations as provided in this dataset has the potential of moving artificial intelligence from monolingual understanding towards multilingual understanding.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Ta-Chung Chi (19 papers)
  2. Ching-Yen Shih (2 papers)
  3. Yun-Nung Chen (104 papers)
Citations (3)

Summary

We haven't generated a summary for this paper yet.

Github Logo Streamline Icon: https://streamlinehq.com