Baichuan2-Sum: Instruction Finetune Baichuan2-7B Model for Dialogue Summarization (2401.15496v3)

Published 27 Jan 2024 in cs.CL, cs.AI, and cs.LG

Abstract: LLMs such as Llama, Baichuan, and Bloom show remarkable ability on many natural language tasks after instruction fine-tuning. Nevertheless, for dialogue summarization, which aims to generate summaries for the different roles in a dialogue, most state-of-the-art methods are built on small models (e.g., BART and BERT). Existing methods add task-specific optimizations to these small models, such as a global-local centrality score. In this paper, we propose an instruction fine-tuned model, Baichuan2-Sum, for role-oriented dialogue summarization. By setting different instructions for different roles, the model can learn from the dialogue interactions and output the expected summaries. Furthermore, we apply the NEFTune technique, which adds suitable noise to embeddings during training, to improve the results. Experiments demonstrate that the proposed model achieves new state-of-the-art results on two public dialogue summarization datasets: CSDS and SAMSum. We release our model and related code to facilitate future studies on the dialogue summarization task.
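The abstract names two concrete techniques: wrapping each dialogue in a role-specific instruction, and NEFTune-style noise added to token embeddings during training. A minimal sketch of both is below; the prompt templates, function names, and `alpha` value are illustrative assumptions rather than the authors' released code, while the noise scale `alpha / sqrt(L * d)` follows the NEFTune paper.

```python
# Minimal sketch, not the authors' released implementation.
# build_role_prompt: hypothetical role-conditioned instruction templates.
# attach_neftune: NEFTune noise via a PyTorch forward hook on the embedding layer.

import math
import torch

def build_role_prompt(dialogue: str, role: str) -> str:
    """Wrap a dialogue in a role-specific instruction (hypothetical wording)."""
    instructions = {
        "user": "Summarize the customer's requests in the dialogue below.",
        "agent": "Summarize the agent's responses in the dialogue below.",
        "overall": "Summarize the whole dialogue below.",
    }
    return f"{instructions[role]}\n\nDialogue:\n{dialogue}\n\nSummary:"

def attach_neftune(embedding: torch.nn.Embedding, alpha: float = 5.0):
    """Add uniform noise to token embeddings during training only (NEFTune).

    Noise is drawn from Uniform(-s, s) with s = alpha / sqrt(L * d),
    where L is the sequence length and d the embedding dimension.
    """
    def hook(module, inputs, output):
        if module.training:
            seq_len, dim = output.shape[-2], output.shape[-1]
            scale = alpha / math.sqrt(seq_len * dim)
            # Returning a tensor from a forward hook replaces the module output.
            return output + torch.empty_like(output).uniform_(-scale, scale)
        return output
    return embedding.register_forward_hook(hook)
```

Because the hook checks `module.training`, the same model can be used for inference unchanged; calling the returned handle's `remove()` detaches the noise entirely.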

Authors (6)
  1. Jianfei Xiao (3 papers)
  2. Yancan Chen (2 papers)
  3. Yimin Ou (2 papers)
  4. Hanyi Yu (7 papers)
  5. Yiyong Xiao (1 paper)
  6. Kai Shu (88 papers)
Citations (8)
