Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
167 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Pre-training with Meta Learning for Chinese Word Segmentation (2010.12272v2)

Published 23 Oct 2020 in cs.CL

Abstract: Recent researches show that pre-trained models (PTMs) are beneficial to Chinese Word Segmentation (CWS). However, PTMs used in previous works usually adopt LLMing as pre-training tasks, lacking task-specific prior segmentation knowledge and ignoring the discrepancy between pre-training tasks and downstream CWS tasks. In this paper, we propose a CWS-specific pre-trained model METASEG, which employs a unified architecture and incorporates meta learning algorithm into a multi-criteria pre-training task. Empirical results show that METASEG could utilize common prior segmentation knowledge from different existing criteria and alleviate the discrepancy between pre-trained models and downstream CWS tasks. Besides, METASEG can achieve new state-of-the-art performance on twelve widely-used CWS datasets and significantly improve model performance in low-resource settings.

Summary

We haven't generated a summary for this paper yet.