Testing pre-trained Transformer models for Lithuanian news clustering (2004.03461v1)

Published 3 Apr 2020 in cs.IR, cs.CL, and cs.LG

Abstract: The recent introduction of the Transformer deep learning architecture brought breakthroughs in various natural language processing tasks. However, non-English languages could not leverage these new opportunities with models pre-trained on English text. This changed with research focusing on multilingual models, of which less-spoken languages are the main beneficiaries. We compare pre-trained multilingual BERT, XLM-R, and older learned text representation methods as encodings for the task of Lithuanian news clustering. Our results indicate that publicly available pre-trained multilingual Transformer models can be fine-tuned to surpass word vectors but still score much lower than specially trained doc2vec embeddings.
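
The pipeline the abstract describes, using a pre-trained multilingual Transformer as a document encoder and comparing it against a doc2vec baseline on a clustering task, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the mean pooling, the two-cluster k-means, the sample snippets, and the doc2vec hyperparameters are all assumptions, and the fine-tuning step the paper applies to the Transformer is omitted here.

```python
# Minimal sketch (not the paper's code): embed Lithuanian news snippets with
# multilingual BERT, cluster them with k-means, and do the same with a small
# doc2vec model. Pooling, cluster count, texts, and hyperparameters are
# illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.cluster import KMeans
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

texts = [
    # Placeholder snippets (politics vs. basketball), not the paper's corpus.
    "Vyriausybė pristatė naują biudžeto projektą.",
    "Seimas svarstys mokesčių įstatymo pakeitimus.",
    "Krepšinio rinktinė laimėjo draugiškas rungtynes.",
    "Klubas pasirašė sutartį su nauju krepšinio treneriu.",
]

# 1) Multilingual BERT embeddings (mBERT; the paper also evaluates XLM-R).
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased").eval()
with torch.no_grad():
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**enc).last_hidden_state
    mask = enc["attention_mask"].unsqueeze(-1).float()
    # Mean-pool token embeddings, masking out padding positions.
    bert_emb = ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# 2) doc2vec embeddings trained on the same texts.
tagged = [TaggedDocument(t.lower().split(), [i]) for i, t in enumerate(texts)]
d2v = Doc2Vec(tagged, vector_size=50, min_count=1, epochs=40, seed=0)
d2v_emb = [d2v.dv[i] for i in range(len(texts))]

# 3) Cluster each set of embeddings; two clusters for this toy sample.
for name, emb in [("mBERT", bert_emb), ("doc2vec", d2v_emb)]:
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
    print(name, labels)
```

On a real corpus, the cluster assignments would be scored against gold news categories; it is on such a comparison that the paper finds specially trained doc2vec embeddings ahead of the fine-tuned multilingual Transformers.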

