
Modeling Real-Time Interactive Conversations as Timed Diarized Transcripts (2405.13203v1)

Published 21 May 2024 in cs.LG and cs.CL

Abstract: Chatbots built upon LLMs have exploded in popularity, but they have largely been limited to synchronous, turn-by-turn dialogues. In this paper we present a simple yet general method to simulate real-time interactive conversations using pretrained text-only LLMs, by modeling timed diarized transcripts and decoding them with causal rejection sampling. We demonstrate the promise of this method with two case studies: instant messenger dialogues and spoken conversations, which require generation at about 30 tok/s and 20 tok/s respectively to maintain real-time interactivity. These capabilities can be added into LLMs using relatively little data and run on commodity hardware.
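The abstract names two ingredients, timed diarized transcripts and causal rejection sampling, without showing what they might look like in practice. The following is a minimal sketch of one plausible reading, not the authors' implementation: a transcript is a list of timestamped, speaker-labeled events, and a proposed model event is discarded whenever a real user event arrives before its scheduled time. The names `Event`, `format_transcript`, `causal_rejection_step`, and the `propose` callback (a stand-in for an LLM sampler) are all hypothetical, as is the bracketed timestamp format.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Event:
    time: float   # seconds since the conversation started
    speaker: str  # diarization label, e.g. "user" or "bot"
    text: str


def format_transcript(events: List[Event]) -> str:
    """Render events as one timestamped, speaker-labeled line each.

    The "[time] speaker: text" layout is an assumption for illustration.
    """
    return "\n".join(f"[{e.time:.2f}] {e.speaker}: {e.text}" for e in events)


def causal_rejection_step(
    history: List[Event],
    propose: Callable[[List[Event]], Event],
    next_user_event: Optional[Event],
) -> Event:
    """Return the event that actually happens next.

    `propose` samples the model's next event, including when it would occur.
    If the user acts before that time, the proposal no longer matches what
    really happened, so it is rejected and the user event is taken instead.
    The caller would then append it to the history and propose again.
    """
    candidate = propose(history)
    if next_user_event is not None and next_user_event.time < candidate.time:
        return next_user_event  # proposal rejected: the user acted first
    return candidate


# Toy stand-in for an LLM: always "replies" two seconds after the last event.
def toy_propose(history: List[Event]) -> Event:
    return Event(history[-1].time + 2.0, "bot", f"reply {len(history)}")


history = [Event(0.0, "user", "hi")]
accepted = causal_rejection_step(history, toy_propose, None)
interrupted = causal_rejection_step(
    history, toy_propose, Event(1.0, "user", "wait")
)
```

In this toy run, `accepted` is the bot's proposed reply at 2.0 seconds, while `interrupted` is the user's message at 1.0 seconds, because it precedes the proposal. A real system would additionally need generation to be fast enough (the abstract cites roughly 30 tok/s for instant messaging and 20 tok/s for speech) so that proposals are ready before their scheduled times.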

Authors (3)
  1. Garrett Tanzer (11 papers)
  2. Gustaf Ahdritz (5 papers)
  3. Luke Melas-Kyriazi (22 papers)
Citations (1)

