
Enhancing Pre-trained Language Model with Lexical Simplification (2012.15070v1)

Published 30 Dec 2020 in cs.CL

Abstract: For both human readers and pre-trained language models (PrLMs), lexical diversity may lead to confusion and inaccuracy when understanding the underlying semantic meanings of given sentences. By substituting complex words with simple alternatives, lexical simplification (LS) is a recognized method to reduce such lexical diversity, and therefore to improve the understandability of sentences. In this paper, we leverage LS and propose a novel approach which can effectively improve the performance of PrLMs in text classification. A rule-based simplification process is applied to a given sentence. PrLMs are encouraged to predict the real label of the given sentence with auxiliary inputs from the simplified version. Using strong PrLMs (BERT and ELECTRA) as baselines, our approach can still further improve the performance in various text classification tasks.
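
As a rough illustration of the idea in the abstract, the sketch below pairs each input sentence with a rule-based simplification and encodes both with BERT, using the simplified version as an auxiliary input to the classifier. The toy substitution table, the `simplify` helper, the `SimplificationAugmentedClassifier` name, and the concatenation-based fusion are all illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code): classify a sentence using both
# its original form and a rule-based lexical simplification as inputs.
import torch
from torch import nn
from transformers import BertModel, BertTokenizer

# Toy rule table standing in for the paper's rule-based LS process (assumption).
SIMPLE_SUBSTITUTIONS = {"utilize": "use", "commence": "begin"}

def simplify(sentence: str) -> str:
    """Toy rule-based lexical simplification: swap complex words for simple ones."""
    return " ".join(SIMPLE_SUBSTITUTIONS.get(w.lower(), w) for w in sentence.split())

class SimplificationAugmentedClassifier(nn.Module):
    """Encodes original and simplified sentences, fuses them for classification."""

    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.encoder.config.hidden_size
        # Fusion by concatenating the two pooled [CLS] vectors (assumption).
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, original: dict, simplified: dict) -> torch.Tensor:
        h_orig = self.encoder(**original).pooler_output
        h_simp = self.encoder(**simplified).pooler_output
        return self.classifier(torch.cat([h_orig, h_simp], dim=-1))

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = SimplificationAugmentedClassifier(num_labels=2)

sentence = "We utilize this method to commence the experiment."
orig = tokenizer(sentence, return_tensors="pt")
simp = tokenizer(simplify(sentence), return_tensors="pt")
logits = model(orig, simp)  # shape: (1, num_labels)
```

Note that, following the abstract, the simplified sentence is kept as an auxiliary signal alongside the original rather than replacing it, which is why both encodings feed the classifier.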

Authors (4)
  1. Rongzhou Bao
  2. Jiayi Wang
  3. Zhuosheng Zhang
  4. Hai Zhao
