s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning (2110.13640v1)

Published 26 Oct 2021 in cs.CL

Abstract: Pretrained bidirectional Transformers, such as BERT, have achieved significant improvements in a wide variety of language understanding tasks, while it is not straightforward to directly apply them for natural language generation. In this paper, we present a sequence-to-sequence fine-tuning toolkit s2s-ft, which adopts pretrained Transformers for conditional generation tasks. Inspired by UniLM, we implement three sequence-to-sequence fine-tuning algorithms, namely, causal fine-tuning, masked fine-tuning, and pseudo-masked fine-tuning. By leveraging the existing pretrained bidirectional Transformers, experimental results show that s2s-ft achieves strong performance on several benchmarks of abstractive summarization and question generation. Moreover, we demonstrate that the package s2s-ft supports both monolingual and multilingual NLG tasks. The s2s-ft toolkit is available at https://github.com/microsoft/unilm/tree/master/s2s-ft.

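The three fine-tuning algorithms differ mainly in how the self-attention mask and prediction targets are arranged when a bidirectional encoder is repurposed for generation. As a rough illustration only (this is not the s2s-ft API, and the function name and shapes are assumptions), the sketch below builds a UniLM-style sequence-to-sequence attention mask in PyTorch: source tokens attend bidirectionally within the source, while target tokens attend to the full source and only to earlier target positions.

```python
# Minimal sketch, assuming a UniLM-style seq2seq self-attention mask.
# Not the s2s-ft API; names and shapes are illustrative.
import torch


def seq2seq_attention_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    """Return a (src_len + tgt_len, src_len + tgt_len) boolean mask.

    mask[i, j] is True when position i may attend to position j.
    """
    total = src_len + tgt_len
    mask = torch.zeros(total, total, dtype=torch.bool)

    # Every position (source and target) sees the whole source segment.
    mask[:, :src_len] = True

    # Target positions additionally see themselves and earlier targets
    # (causal / left-to-right over the target segment).
    causal = torch.tril(torch.ones(tgt_len, tgt_len)).bool()
    mask[src_len:, src_len:] = causal

    # Source positions never see the target segment.
    mask[:src_len, src_len:] = False
    return mask


if __name__ == "__main__":
    # Source of 3 tokens, target of 2 tokens.
    print(seq2seq_attention_mask(src_len=3, tgt_len=2).int())
```

A mask of this shape can be passed as the attention mask of a standard Transformer encoder, which is how a bidirectional model can be fine-tuned to generate the target conditioned on the source without changing the architecture.
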
Authors (5)
  1. Hangbo Bao (17 papers)
  2. Li Dong (154 papers)
  3. Wenhui Wang (47 papers)
  4. Nan Yang (182 papers)
  5. Furu Wei (291 papers)
Citations (11)