
Cognition-aware Cognate Detection (2112.08087v1)

Published 15 Dec 2021 in cs.CL and cs.AI

Abstract: Automatic detection of cognates helps downstream NLP tasks such as Machine Translation, Cross-lingual Information Retrieval, Computational Phylogenetics and Cross-lingual Named Entity Recognition. Previous approaches to cognate detection use feature sets based on orthographic, phonetic and semantic similarity. In this paper, we propose a novel method for enriching these feature sets with cognitive features extracted from human readers' gaze behaviour. We collect gaze behaviour data for a small sample of cognates and show that the extracted cognitive features help the task of cognate detection. However, gaze data collection and annotation are costly. We therefore use the collected gaze behaviour data to predict cognitive features for a larger sample and show that the predicted cognitive features also significantly improve task performance. We report improvements of 10% with the collected gaze features, and 12% with the predicted gaze features, over the previously proposed approaches. Furthermore, we release the collected gaze behaviour data along with our code and cross-lingual models.

Authors (6)
  1. Diptesh Kanojia (58 papers)
  2. Prashant Sharma (17 papers)
  3. Sayali Ghodekar (2 papers)
  4. Pushpak Bhattacharyya (153 papers)
  5. Gholamreza Haffari (141 papers)
  6. Malhar Kulkarni (7 papers)
Citations (10)
