
Multilingual and Cross-Lingual Intent Detection from Spoken Data (2104.08524v1)

Published 17 Apr 2021 in cs.CL

Abstract: We present a systematic study on multilingual and cross-lingual intent detection from spoken data. The study leverages a new resource put forth in this work, termed MInDS-14, a first training and evaluation resource for the intent detection task with spoken data. It covers 14 intents extracted from a commercial system in the e-banking domain, associated with spoken examples in 14 diverse language varieties. Our key results indicate that combining machine translation models with state-of-the-art multilingual sentence encoders (e.g., LaBSE) can yield strong intent detectors in the majority of target languages covered in MInDS-14. We also offer comparative analyses across different axes: e.g., zero-shot versus few-shot learning, translation direction, and impact of speech recognition. We see this work as an important step towards more inclusive development and evaluation of multilingual intent detectors from spoken data, in a much wider spectrum of languages compared to prior work.

Authors (9)
  1. Daniela Gerz
  2. Pei-Hao Su
  3. Razvan Kusztos
  4. Avishek Mondal
  5. Michał Lis
  6. Eshan Singhal
  7. Nikola Mrkšić
  8. Tsung-Hsien Wen
  9. Ivan Vulić
Citations (28)