Prompting and Fine-Tuning of Small LLMs for Length-Controllable Telephone Call Summarization

Published 24 Oct 2024 in cs.CL, cs.AI, and cs.LG | (2410.18624v1)

Abstract: This paper explores the rapid development of a telephone call summarization system utilizing LLMs. Our approach involves initial experiments with prompting existing LLMs to generate summaries of telephone conversations, followed by the creation of a tailored synthetic training dataset utilizing stronger frontier models. We place special focus on the diversity of the generated data and on the ability to control the length of the generated summaries to meet various use-case-specific requirements. The effectiveness of our method is evaluated using two state-of-the-art LLM-as-a-judge-based evaluation techniques to ensure the quality and relevance of the summaries. Our results show that a fine-tuned Llama-2-7B-based summarization model performs on par with GPT-4 in terms of factual accuracy, completeness, and conciseness. Our findings demonstrate the potential for quickly bootstrapping a practical and efficient call summarization system.

