
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation (2010.02705v1)

Published 6 Oct 2020 in cs.CL, cs.AI, and cs.LG

Abstract: We propose a method to automatically generate domain- and task-adaptive maskings of a given text for self-supervised pre-training, such that we can effectively adapt the language model to a particular target task (e.g., question answering). Specifically, we present a novel reinforcement learning-based framework which learns the masking policy, such that using the generated masks for further pre-training of the target language model helps improve task performance on unseen texts. We use off-policy actor-critic with entropy regularization and experience replay for reinforcement learning, and propose a Transformer-based policy network that can consider the relative importance of words in a given text. We validate our Neural Mask Generator (NMG) on several question answering and text classification datasets using BERT and DistilBERT as the language models, on which it outperforms rule-based masking strategies by automatically learning optimal adaptive maskings.
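
The abstract outlines the mechanism at a high level: a Transformer-based policy scores each token, masks are sampled from those scores, and a policy-gradient signal tied to downstream task performance updates the policy. Below is a minimal PyTorch sketch of that loop, not the authors' implementation: the names (`MaskingPolicy`, `sample_masks`), the model sizes, and the random placeholder reward are illustrative assumptions, and the update shown is a simpler REINFORCE-style step with entropy regularization rather than the paper's off-policy actor-critic with experience replay.

```python
# Minimal sketch (not the authors' code) of a Transformer-based masking
# policy in PyTorch. Names, sizes, and the placeholder reward are
# illustrative assumptions.
import torch
import torch.nn as nn


class MaskingPolicy(nn.Module):
    """Scores each token position; higher score -> more likely to be masked."""

    def __init__(self, vocab_size=30522, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.score = nn.Linear(d_model, 1)  # one masking logit per token

    def forward(self, token_ids):                    # [batch, seq_len]
        h = self.encoder(self.embed(token_ids))      # [batch, seq_len, d_model]
        return self.score(h).squeeze(-1)             # [batch, seq_len] logits


def sample_masks(logits, mask_ratio=0.15):
    """Sample which positions to mask, in proportion to the policy's scores."""
    probs = torch.softmax(logits, dim=-1)            # distribution over positions
    k = max(1, int(mask_ratio * logits.size(-1)))    # how many tokens to mask
    idx = torch.multinomial(probs, k)                # k positions per sequence
    mask = torch.zeros_like(logits, dtype=torch.bool)
    mask.scatter_(1, idx, True)
    log_prob = torch.log(probs.gather(1, idx) + 1e-9).sum(-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum(-1)
    return mask, log_prob, entropy


# Toy policy-gradient step. In the paper, the reward comes from the target
# model's task performance after further pre-training on the sampled masks;
# here it is a random placeholder.
policy = MaskingPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

tokens = torch.randint(0, 30522, (2, 32))            # fake token-id batch
logits = policy(tokens)
mask, log_prob, entropy = sample_masks(logits)
masked_tokens = tokens.masked_fill(mask, 103)        # 103 = [MASK] in BERT's vocab

reward = torch.rand(2)                               # placeholder reward signal
loss = -(reward * log_prob).mean() - 0.01 * entropy.mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The softmax over positions is what lets masking probability reflect the relative importance of words in the sequence, echoing the paper's Transformer policy network; in the full method, the reward would be the target language model's validation performance after continued pre-training on the sampled masks, which is what drives the policy toward domain- and task-adaptive maskings.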

Authors (3)
  1. Minki Kang
  2. Moonsu Han
  3. Sung Ju Hwang
Citations (16)
