
On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer (2307.13343v1)

Published 25 Jul 2023 in eess.AS, cs.CR, and cs.SD

Abstract: Smart devices serviced by large-scale AI models necessitate user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task, Automatic Speech Recognition (ASR). The proposed framework attaches flexible gradient-reversal-based speaker adversarial layers to target layers within an ASR model, where speaker adversarial training anonymizes the acoustic embeddings generated by the targeted layers to remove speaker identity. We propose on-device deployment by executing the initial layers of the ASR model on the device and transmitting the anonymized embeddings to the cloud, where the rest of the model is executed while preserving privacy. Experimental results show that our method reduces speaker recognition accuracy by 33% relative and improves ASR performance, achieving a 6.2% relative Word Error Rate (WER) reduction.
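
The abstract does not give implementation details, so the following is only a minimal sketch of the general gradient reversal idea it refers to, not the authors' code. A gradient reversal layer passes embeddings through unchanged in the forward pass and flips the sign of the gradient in the backward pass, so the shared layers are pushed to make the speaker classifier worse while the ASR loss is still minimized. The names `GradientReversal`, `lam`, and `encoder_gradient` are hypothetical and chosen for illustration; plain Python lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    lam: float = 1.0  # hypothetical reversal strength, not a value from the paper

    def forward(self, x: List[float]) -> List[float]:
        # The embedding is passed to the speaker classifier unchanged.
        return list(x)

    def backward(self, grad_out: List[float]) -> List[float]:
        # The speaker-loss gradient is sign-flipped (and scaled) before it
        # reaches the shared layers below the reversal point.
        return [-self.lam * g for g in grad_out]


def encoder_gradient(
    asr_grad: List[float], spk_grad: List[float], grl: GradientReversal
) -> List[float]:
    """Sum the ASR gradient and the reversed speaker gradient at the embedding."""
    reversed_spk = grl.backward(spk_grad)
    return [a + s for a, s in zip(asr_grad, reversed_spk)]


if __name__ == "__main__":
    grl = GradientReversal(lam=0.5)
    print(grl.forward([1.0, -2.0]))
    print(encoder_gradient([1.0, 1.0], [2.0, -4.0], grl))
```

In a real training setup the same behaviour would normally be expressed as a custom autograd operation in a deep learning framework, attached at whichever layer is chosen as the anonymization point.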

Authors (8)
  1. Md Asif Jalal
  2. Pablo Peso Parada
  3. Jisi Zhang
  4. Karthikeyan Saravanan
  5. Mete Ozay
  6. Myoungji Han
  7. Jung In Lee
  8. Seokyeong Jung
Citations (1)
