
Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER (1909.00153v3)

Published 31 Aug 2019 in cs.CL and cs.LG

Abstract: Contextual word embeddings (e.g. GPT, BERT, ELMo) have demonstrated state-of-the-art performance on various NLP tasks. Recent work with the multilingual version of BERT has shown that the model performs very well in zero-shot and zero-resource cross-lingual settings, where only labeled English data is used to fine-tune the model. We improve upon multilingual BERT's zero-resource cross-lingual performance via adversarial learning. We report the magnitude of the improvement on the multilingual MLDoc text classification and CoNLL 2002/2003 named entity recognition tasks. Furthermore, we show that language-adversarial training encourages BERT to align the embeddings of English documents and their translations, which may be the cause of the observed performance gains.
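
One way to make the alignment claim concrete is to score how close the embedding of each English document sits to the embedding of its translation. The sketch below uses mean cosine similarity over paired document vectors. This is an illustrative assumption: the abstract does not say which alignment measure the authors use, and the function names `cosine_similarity` and `mean_pair_similarity` are hypothetical. The input vectors stand in for document embeddings that would come from a model such as multilingual BERT.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_pair_similarity(english_embeddings, translated_embeddings):
    """Average cosine similarity over (English, translation) embedding pairs.

    Assumption: the i-th vector in each list embeds the same document,
    once in English and once in translation.
    """
    if len(english_embeddings) != len(translated_embeddings):
        raise ValueError("both lists must contain the same number of documents")
    if not english_embeddings:
        raise ValueError("at least one document pair is required")
    scores = [
        cosine_similarity(en, tr)
        for en, tr in zip(english_embeddings, translated_embeddings)
    ]
    return sum(scores) / len(scores)
```

Under this measure, a higher average after language-adversarial training than before it would be consistent with the abstract's statement that the embeddings of English documents and their translations become better aligned.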

Authors (3)
  1. Phillip Keung (11 papers)
  2. Yichao Lu (22 papers)
  3. Vikas Bhardwaj (9 papers)
Citations (80)