
GPT-Calls: Enhancing Call Segmentation and Tagging by Generating Synthetic Conversations via Large Language Models

Published 9 Jun 2023 in cs.CL and cs.LG | arXiv:2306.07941v1

Abstract: Transcriptions of phone calls are of significant value across diverse fields, such as sales, customer service, healthcare, and law enforcement. Nevertheless, the analysis of these recorded conversations can be an arduous and time-intensive process, especially when dealing with extended or multifaceted dialogues. In this work, we propose a novel method, GPT-distilled Calls Segmentation and Tagging (GPT-Calls), for efficient and accurate call segmentation and topic extraction. GPT-Calls is composed of offline and online phases. The offline phase is applied once to a given list of topics and involves generating a distribution of synthetic sentences for each topic using a GPT model and extracting anchor vectors. The online phase is applied to every call separately and scores the similarity between the transcribed conversation and the topic anchors found in the offline phase. Then, time-domain analysis is applied to the similarity scores to group utterances into segments and tag them with topics. The proposed paradigm provides an accurate and efficient method for call segmentation and topic extraction that does not require labeled data, thus making it a versatile approach applicable to various domains. Our algorithm operates in production under Dynamics 365 Sales Conversation Intelligence, and our research is based on real sales conversations gathered from various Dynamics 365 Sales tenants.
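The two-phase pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `embed` function is a toy hashed bag-of-words stand-in for the sentence encoder the authors use, the synthetic sentences would in practice come from a GPT model rather than being hard-coded, and all function names and the simple moving-average smoothing are assumptions for illustration.

```python
import zlib
import numpy as np

DIM = 256  # size of the toy embedding space


def embed(text):
    """Toy stand-in for a sentence encoder: a hashed, L2-normalized
    bag-of-words vector. The paper uses learned sentence embeddings."""
    v = np.zeros(DIM)
    for w in text.lower().split():
        v[zlib.crc32(w.encode()) % DIM] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v


def topic_anchors(synthetic_sentences):
    """Offline phase: one anchor vector per topic, here simply the mean
    embedding of that topic's (GPT-generated) synthetic sentences."""
    return {topic: np.mean([embed(s) for s in sents], axis=0)
            for topic, sents in synthetic_sentences.items()}


def tag_call(utterances, anchors, window=3):
    """Online phase: score each utterance against every topic anchor,
    smooth the scores along the time axis, and merge consecutive
    utterances sharing a best topic into (start, end, topic) segments."""
    topics = list(anchors)
    A = np.stack([anchors[t] for t in topics])             # (n_topics, DIM)
    scores = np.stack([A @ embed(u) for u in utterances])  # (n_utt, n_topics)
    if window > 1:  # moving-average smoothing stands in for the paper's
        kernel = np.ones(window) / window  # time-domain analysis
        scores = np.apply_along_axis(
            lambda col: np.convolve(col, kernel, mode="same"), 0, scores)
    best = scores.argmax(axis=1)
    segments, start = [], 0
    for i in range(1, len(best) + 1):
        if i == len(best) or best[i] != best[start]:
            segments.append((start, i - 1, topics[best[start]]))
            start = i
    return segments
```

Because topic assignment reduces to similarity against precomputed anchors, no labeled calls are needed; only the topic list and the offline synthetic-sentence generation change when the method is moved to a new domain.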
