Is Multilingual BERT Fluent in Language Generation? (1910.03806v1)
Abstract: The multilingual BERT model is trained on 104 languages and meant to serve as a universal language model and tool for encoding sentences. We explore how well the model performs on several languages across several tasks: a diagnostic classification probing the embeddings for a particular syntactic property, a cloze task testing the language modelling ability to fill in gaps in a sentence, and a natural language generation task testing for the ability to produce coherent text fitting a given context. We find that the currently available multilingual BERT model is clearly inferior to the monolingual counterparts, and cannot in many cases serve as a substitute for a well-trained monolingual model. We find that the English and German models perform well at generation, whereas the multilingual model is lacking, in particular, for Nordic languages.
- Samuel Rönnqvist (14 papers)
- Jenna Kanerva (17 papers)
- Tapio Salakoski (9 papers)
- Filip Ginter (28 papers)
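
The cloze task described in the abstract can be reproduced in spirit with a masked-token prediction call against the public `bert-base-multilingual-cased` checkpoint. The sketch below is illustrative only and makes assumptions: the example sentences, the `top_k` setting, and the use of the Hugging Face `fill-mask` pipeline are not taken from the paper's actual evaluation setup.

```python
from transformers import pipeline

# Fill-mask pipeline with multilingual BERT; this only illustrates the kind of
# cloze-style gap filling the paper probes, not the authors' exact protocol.
fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

# Hypothetical cloze sentences (English and Finnish) chosen for illustration.
sentences = [
    "The capital of Finland is [MASK].",
    "Suomen pääkaupunki on [MASK].",
]

for sentence in sentences:
    print(sentence)
    # Return the three highest-scoring fillers for the masked position.
    for pred in fill(sentence, top_k=3):
        print(f"  {pred['token_str']!r}  score={pred['score']:.3f}")
```

Comparing such predictions against a monolingual model for the same language (e.g. a Finnish BERT for the Finnish sentence) gives a rough sense of the gap the paper reports between multilingual and monolingual models.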