YAYI-UIE: A Chat-Enhanced Instruction Tuning Framework for Universal Information Extraction (2312.15548v3)
Abstract: The difficulty of information extraction lies in handling task-specific label schemas and heterogeneous data structures. Recent work has proposed LLM-based methods to uniformly model different information extraction tasks. However, these existing methods fall short in their information extraction capabilities for languages other than English, such as Chinese. In this paper, we propose an end-to-end chat-enhanced instruction tuning framework for universal information extraction (YAYI-UIE), which supports both Chinese and English. Specifically, we jointly utilize dialogue data and information extraction data to enhance information extraction performance. Experimental results show that the proposed framework achieves state-of-the-art performance on Chinese datasets and comparable performance on English datasets under both supervised and zero-shot settings.
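The chat-enhanced instruction tuning described above can be sketched as follows: each labeled IE example is cast into a chat-style instruction/response pair and interleaved with ordinary dialogue data, so one training stream covers both objectives. The prompt wording, field names, and mixing scheme below are illustrative assumptions, not the paper's exact data format.

```python
import json
import random

def ie_to_instruction(text, task, schema, answer):
    """Turn one labeled IE example into a chat-style training pair.

    The prompt spells out the task and the target label schema so the
    model can generalize to unseen schemas at inference time.
    """
    prompt = (
        f"Task: {task}\n"
        f"Schema: {json.dumps(schema, ensure_ascii=False)}\n"
        f"Text: {text}\n"
        "Extract all instances matching the schema as JSON."
    )
    return {"instruction": prompt,
            "response": json.dumps(answer, ensure_ascii=False)}

def mix_corpora(ie_pairs, chat_pairs, seed=0):
    """Interleave IE pairs and dialogue pairs into one training stream,
    so the two data sources are optimized jointly rather than in stages."""
    data = list(ie_pairs) + list(chat_pairs)
    random.Random(seed).shuffle(data)
    return data

# One IE example (NER) and one dialogue example, mixed together.
ie_pair = ie_to_instruction(
    text="Barack Obama was born in Hawaii.",
    task="Named Entity Recognition",
    schema=["person", "location"],
    answer={"person": ["Barack Obama"], "location": ["Hawaii"]},
)
chat_pair = {"instruction": "Say hello in French.", "response": "Bonjour !"}
train_set = mix_corpora([ie_pair], [chat_pair])
```

The resulting `train_set` is a flat list of instruction/response pairs that a standard supervised fine-tuning loop can consume directly; `ensure_ascii=False` keeps Chinese text readable in the serialized responses.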
Authors: Xinglin Xiao, Yijie Wang, Nan Xu, Yuqi Wang, Hanxuan Yang, Minzheng Wang, Yin Luo, Lei Wang, Wenji Mao, Daniel Zeng