"Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time (2405.00801v2)
Abstract: Customer service is how companies interface with their customers, and it can contribute heavily to overall customer satisfaction. However, high-quality service can become expensive, creating an incentive to make it as cost-efficient as possible and prompting most companies to deploy AI-powered assistants, or "chatbots". On the other hand, customers still want human-to-human interaction, especially in complex scenarios such as disputes and sensitive topics like bill payment. This raises the bar for customer service agents: they need to accurately understand the customer's question or concern and identify a solution that is acceptable, feasible, and within company policy, all while handling multiple conversations at once. In this work, we introduce "Ask Me Anything" (AMA) as an add-on feature to an agent-facing customer service interface. AMA allows agents to ask an LLM questions on demand while they handle customer conversations -- the LLM provides accurate responses in real time, reducing the amount of context switching the agent needs. In our internal experiments, we find that agents using AMA, compared with a traditional search experience, spend approximately 10% less time per conversation that contains a search, translating to millions of dollars of savings annually. Agents who used the AMA feature provided positive feedback nearly 80% of the time, demonstrating its usefulness as an AI-assisted feature for customer care.
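The abstract frames AMA as an on-demand question-answering aid grounded in the company's internal knowledge, but it does not spell out the architecture. The sketch below assumes a common retrieval-augmented pattern: embed the agent's question, select the most similar knowledge-base passages, and prompt an LLM to answer from that context only. Every name here (`KB_PASSAGES`, `embed`, `llm_complete`) and the prompt wording are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of an AMA-style helper, NOT the paper's actual system:
# retrieve the most relevant knowledge-base passages for an agent's question,
# then ask an LLM to answer using only that retrieved context.

from typing import Callable, List, Tuple
import numpy as np


def top_k_passages(
    question: str,
    passages: List[str],
    embed: Callable[[str], np.ndarray],  # any sentence-embedding model (assumed)
    k: int = 3,
) -> List[str]:
    """Rank passages by cosine similarity to the question embedding."""
    q = embed(question)
    scored: List[Tuple[float, str]] = []
    for p in passages:
        v = embed(p)
        score = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
        scored.append((score, p))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in scored[:k]]


def answer_agent_question(
    question: str,
    passages: List[str],
    embed: Callable[[str], np.ndarray],
    llm_complete: Callable[[str], str],  # placeholder for any LLM completion call (assumed)
) -> str:
    """Build a grounded prompt from retrieved passages and query the LLM."""
    context = "\n\n".join(top_k_passages(question, passages, embed))
    prompt = (
        "You are assisting a customer service agent.\n"
        "Answer the question using ONLY the policy excerpts below; "
        "if the answer is not covered, say you do not know.\n\n"
        f"Policy excerpts:\n{context}\n\n"
        f"Agent question: {question}\nAnswer:"
    )
    return llm_complete(prompt)
```

In practice the agent-facing interface would call something like `answer_agent_question` with the live question while the conversation stays on screen, which is what lets the agent avoid switching to a separate search tool.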