
Federated Representation Learning for Automatic Speech Recognition (2308.02013v2)

Published 3 Aug 2023 in cs.SD, cs.CL, cs.LG, and eess.AS

Abstract: Federated Learning (FL) is a privacy-preserving paradigm that allows edge devices to learn collaboratively without sharing data. Edge devices like Alexa and Siri are prospective sources of unlabeled audio data that can be tapped to learn robust audio representations. In this work, we bring Self-supervised Learning (SSL) and FL together to learn representations for Automatic Speech Recognition while respecting data privacy constraints. We use the speaker and chapter information in the unlabeled speech dataset Libri-Light to simulate non-IID, speaker-siloed data distributions, and pre-train an LSTM encoder with the Contrastive Predictive Coding (CPC) framework using FedSGD. We show that the ASR encoder pre-trained in FL performs as well as a centrally pre-trained model and yields a 12-15% relative improvement in Word Error Rate (WER) compared to no pre-training. We further adapt the federated pre-trained models to a new language, French, and show a 20% WER improvement over no pre-training.
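The FedSGD procedure the abstract mentions can be sketched as follows. This is an illustrative toy, not the paper's implementation: the quadratic loss stands in for the CPC objective, and the client data, shapes, and learning rate are assumptions. The key mechanic is that each speaker-siloed client computes a gradient on its local data, and the server applies one SGD step on the sample-weighted average of those gradients.

```python
import numpy as np

def client_gradient(weights, x, y):
    # Toy least-squares gradient; in the paper this would be the
    # gradient of the CPC loss through the LSTM encoder.
    err = x @ weights - y
    return x.T @ err / len(x)

def fedsgd_round(weights, clients, lr=0.1):
    # Server aggregates client gradients weighted by sample count,
    # then takes a single SGD step (FedSGD, no local epochs).
    total = sum(len(x) for x, _ in clients)
    agg = sum(len(x) * client_gradient(weights, x, y)
              for x, y in clients) / total
    return weights - lr * agg

rng = np.random.default_rng(0)
w = np.zeros(3)
# Two "speaker silos" with deliberately different (non-IID) input
# distributions, mimicking the speaker-siloed Libri-Light split.
clients = [
    (rng.normal(0.0, 1.0, (8, 3)), rng.normal(0.0, 1.0, 8)),
    (rng.normal(2.0, 1.0, (4, 3)), rng.normal(1.0, 1.0, 4)),
]
for _ in range(5):
    w = fedsgd_round(w, clients)
```

Because every round communicates only gradients, raw audio never leaves the client, which is the privacy property FL provides here.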

Authors (6)
  1. Guruprasad V Ramesh (3 papers)
  2. Gopinath Chennupati (20 papers)
  3. Milind Rao (13 papers)
  4. Anit Kumar Sahu (35 papers)
  5. Ariya Rastrow (55 papers)
  6. Jasha Droppo (24 papers)
