Data Annealing for Informal Language Understanding Tasks (2004.13833v1)
Abstract: There is a huge performance gap between formal and informal language understanding tasks. Recent pre-trained models that improved performance on formal language understanding tasks have not achieved comparable results on informal language. We propose a data annealing transfer learning procedure to bridge the performance gap on informal natural language understanding tasks. It allows a pre-trained model such as BERT to be used successfully on informal language. In our data annealing procedure, the training set contains mainly formal text data at first; then, the proportion of informal text data is gradually increased during the training process. Our data annealing procedure is model-independent and can be applied to various tasks. We validate its effectiveness in exhaustive experiments. When BERT is trained with our learning procedure, it outperforms all the state-of-the-art models on three common informal language tasks.
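The annealing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact schedule, so the exponential decay used here, along with the names `formal_ratio`, `annealed_batches`, `initial_ratio`, and `decay`, are assumptions made for illustration and are not taken from the paper.

```python
import random


def formal_ratio(step, initial_ratio=0.9, decay=0.99):
    """Assumed schedule: fraction of formal examples at a 0-based training step.

    Starts at `initial_ratio` and shrinks geometrically, so the share of
    informal data grows as training proceeds.
    """
    return initial_ratio * (decay ** step)


def annealed_batches(formal, informal, batch_size, num_steps,
                     initial_ratio=0.9, decay=0.99, seed=0):
    """Yield mixed batches whose formal/informal mix follows `formal_ratio`.

    `formal` and `informal` are lists of training examples (hypothetical
    placeholders for labelled formal and informal text).
    """
    rng = random.Random(seed)
    for step in range(num_steps):
        ratio = formal_ratio(step, initial_ratio, decay)
        n_formal = round(batch_size * ratio)
        n_informal = batch_size - n_formal
        # Sample with replacement so either pool may be smaller than a batch.
        batch = rng.choices(formal, k=n_formal) + rng.choices(informal, k=n_informal)
        rng.shuffle(batch)
        yield batch


if __name__ == "__main__":
    formal = [("formal", i) for i in range(100)]
    informal = [("informal", i) for i in range(100)]
    for step, batch in enumerate(annealed_batches(formal, informal, 10, 200)):
        if step % 50 == 0:
            n_formal = sum(1 for source, _ in batch if source == "formal")
            print(step, n_formal, len(batch) - n_formal)
```

Because the schedule only controls how batches are composed, the same generator could feed any model or task, which matches the abstract's claim that the procedure is model-independent.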