
A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding (2211.05869v1)

Published 10 Nov 2022 in cs.CL, cs.SD, and eess.AS

Abstract: Collecting sufficient labeled data for spoken language understanding (SLU) is expensive and time-consuming. Recent studies achieved promising results by using pre-trained models in low-resource scenarios. Inspired by this, we aim to ask: which (if any) pre-training strategies can improve performance across SLU benchmarks? To answer this question, we employ four types of pre-trained models and their combinations for SLU. We leverage self-supervised speech and language models (LM) pre-trained on large quantities of unpaired data to extract strong speech and text representations. We also explore using supervised models pre-trained on larger external automatic speech recognition (ASR) or SLU corpora. We conduct extensive experiments on the SLU Evaluation (SLUE) benchmark and observe self-supervised pre-trained models to be more powerful, with pre-trained LM and speech models being most beneficial for the Sentiment Analysis and Named Entity Recognition task, respectively.
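
The sketch below is not the authors' code; it is a minimal, hypothetical illustration of the general recipe the abstract describes: extracting representations from a self-supervised speech model and a pre-trained LM, which a downstream SLU head (e.g., sentiment or NER classifier) would then be trained on. The Hugging Face transformers API and the specific checkpoints (facebook/wav2vec2-base, bert-base-uncased) are assumptions for illustration, not choices taken from the paper.

```python
# Hypothetical sketch: obtain speech and text representations from
# publicly available pre-trained models, as one might do for an SLU
# pipeline like the one studied in the paper.
import torch
from transformers import (
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Model,
    AutoTokenizer,
    AutoModel,
)

# Self-supervised speech encoder (assumed checkpoint, not from the paper).
speech_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
speech_model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

# Pre-trained language model for text representations (assumed checkpoint).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text_model = AutoModel.from_pretrained("bert-base-uncased")

# Dummy 1-second waveform at 16 kHz standing in for a real utterance.
waveform = torch.zeros(16000)

with torch.no_grad():
    speech_inputs = speech_extractor(
        waveform.numpy(), sampling_rate=16000, return_tensors="pt"
    )
    speech_repr = speech_model(**speech_inputs).last_hidden_state  # (1, frames, 768)

    text_inputs = tokenizer("book a table for two", return_tensors="pt")
    text_repr = text_model(**text_inputs).last_hidden_state        # (1, tokens, 768)

# A task-specific SLU classifier would be trained on top of these
# frozen or fine-tuned representations.
print(speech_repr.shape, text_repr.shape)
```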

Authors (9)
  1. Yifan Peng (147 papers)
  2. Siddhant Arora (50 papers)
  3. Yosuke Higuchi (23 papers)
  4. Yushi Ueda (7 papers)
  5. Sujay Kumar (2 papers)
  6. Karthik Ganesan (9 papers)
  7. Siddharth Dalmia (36 papers)
  8. Xuankai Chang (61 papers)
  9. Shinji Watanabe (416 papers)
Citations (17)
