
Retrofitting Structure-aware Transformer Language Model for End Tasks

Published 16 Sep 2020 in cs.CL (arXiv:2009.07408v1)

Abstract: We consider retrofitting a structure-aware Transformer-based language model to facilitate end tasks, proposing to exploit syntactic distance to encode both phrasal constituency and dependency connections into the language model. A middle-layer structural learning strategy is leveraged for structure integration, trained jointly with the main semantic task under a multi-task learning scheme. Experimental results show that the retrofitted structure-aware Transformer language model achieves improved perplexity while also inducing accurate syntactic phrases. With structure-aware fine-tuning, the model achieves significant improvements on both semantic- and syntactic-dependent tasks.
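The abstract's central notion of "syntactic distance" is not defined on this page. A minimal sketch of one common formulation (following prior work on syntactic distance that this paper builds on; the exact definition used in the paper may differ) assigns each pair of adjacent tokens a distance equal to the height at which they split in the constituency tree, so that higher splits correspond to larger distances:

```python
def syntactic_distances(tree):
    """Compute syntactic distances between adjacent tokens.

    `tree` is a constituency parse as nested tuples with string leaves,
    e.g. (("the", "cat"), ("sat",)). Returns (height, distances), where
    distances[i] is the height of the subtree at which token i and
    token i+1 split -- an illustrative definition, assumed here.
    """
    if isinstance(tree, str):          # leaf: a single token
        return 0, []
    heights, child_dists = [], []
    for child in tree:
        h, d = syntactic_distances(child)
        heights.append(h)
        child_dists.append(d)
    height = max(heights) + 1          # this node sits one level above its children
    merged = []
    for i, d in enumerate(child_dists):
        merged.extend(d)               # distances internal to each child
        if i < len(child_dists) - 1:
            merged.append(height)      # adjacent tokens from sibling subtrees split here
    return height, merged

# For (("the", "cat"), ("sat",)): "the"/"cat" split low (distance 1),
# "cat"/"sat" split at the root (distance 2).
print(syntactic_distances((("the", "cat"), ("sat",)))[1])  # → [1, 2]
```

Under this formulation, the sequence of distances fully determines the (unlabeled) tree shape, which is what lets a language model regress these scalars in a middle layer and still recover phrase structure.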

Citations (43)


Authors (3)
