
Detecting Extraneous Content in Podcasts (2103.02585v1)

Published 3 Mar 2021 in cs.CL

Abstract: Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries.

Authors (6)
  1. Sravana Reddy (8 papers)
  2. Yongze Yu (8 papers)
  3. Aasish Pappu (11 papers)
  4. Aswin Sivaraman (13 papers)
  5. Rezvaneh Rezapour (19 papers)
  6. Rosie Jones (13 papers)
Citations (17)