
BERTGEN: Multi-task Generation through BERT (2106.03484v1)

Published 7 Jun 2021 in cs.CL

Abstract: We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively. BERTGEN is auto-regressively trained for language generation tasks, namely image captioning, machine translation and multimodal machine translation, under a multitask setting. With a comprehensive set of evaluations, we show that BERTGEN outperforms many strong baselines across the tasks explored. We also show BERTGEN's ability for zero-shot language generation, where it exhibits competitive performance to supervised counterparts. Finally, we conduct ablation studies which demonstrate that BERTGEN substantially benefits from multi-tasking and effectively transfers relevant inductive biases from the pre-trained models.
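The abstract's "auto-regressively trained," decoder-only setup can be pictured with a short sketch: a BERT-style checkpoint is wrapped as a causal decoder and generates one token at a time. This is a minimal illustration, not the authors' released code; the checkpoint name, the greedy loop, and the use of Hugging Face's BertLMHeadModel are assumptions, and a base checkpoint would still need the multitask fine-tuning described in the paper to produce useful output.

```python
# Minimal sketch (not the BERTGEN implementation): greedy auto-regressive decoding
# with a BERT-style checkpoint used as a decoder via Hugging Face Transformers.
# Checkpoint name and generation loop are illustrative assumptions only.
import torch
from transformers import BertTokenizer, BertLMHeadModel

checkpoint = "bert-base-multilingual-cased"  # assumed multilingual BERT checkpoint
tokenizer = BertTokenizer.from_pretrained(checkpoint)
# is_decoder=True applies a causal attention mask so the encoder can be used generatively.
model = BertLMHeadModel.from_pretrained(checkpoint, is_decoder=True)
model.eval()

def greedy_generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Append the most probable next token one step at a time (greedy decoding)."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(input_ids).logits            # (1, seq_len, vocab_size)
        next_id = logits[0, -1].argmax().view(1, 1)     # most probable next token
        input_ids = torch.cat([input_ids, next_id], dim=-1)
        if next_id.item() == tokenizer.sep_token_id:    # treat [SEP] as end of sequence
            break
    return tokenizer.decode(input_ids[0], skip_special_tokens=True)

print(greedy_generate("A short example sentence to continue:"))
```

Without task-specific training the output of a raw BERT checkpoint is essentially noise; the point of the sketch is only the decoding loop, i.e. how a bidirectional-pretraining model can be driven auto-regressively for generation tasks such as captioning and translation.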

Authors (4)
  1. Faidon Mitzalis (1 paper)
  2. Ozan Caglayan (20 papers)
  3. Pranava Madhyastha (37 papers)
  4. Lucia Specia (68 papers)
Citations (6)