
Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention (2204.11180v1)

Published 24 Apr 2022 in eess.AS and cs.SD

Abstract: Although few-shot learning has attracted much attention from the fields of image and audio classification, few efforts have been made on few-shot speaker identification. In the task of few-shot learning, overfitting is a tough problem mainly due to the mismatch between training and testing conditions. In this paper, we propose a few-shot speaker identification method which can alleviate the overfitting problem. In the proposed method, the model of a depthwise separable convolutional network with channel attention is trained with a prototypical loss function. Experimental datasets are extracted from three public speech corpora: Aishell-2, VoxCeleb1 and TORGO. Experimental results show that the proposed method exceeds state-of-the-art methods for few-shot speaker identification in terms of accuracy and F-score.
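The prototypical loss named in the abstract can be illustrated with a minimal sketch. The abstract does not give the distance metric or the episode layout, so this sketch assumes squared Euclidean distance, as in the standard prototypical-network formulation, and uses plain lists of numbers in place of the embeddings the network would produce. The function name `prototypical_loss` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def prototypical_loss(support, support_labels, query, query_labels):
    """Return (mean loss, accuracy) for one few-shot episode.

    support / query are lists of embedding vectors (lists of floats).
    Assumption: squared Euclidean distance to class-mean prototypes.
    """
    classes = sorted(set(support_labels))
    dim = len(support[0])

    # One prototype per speaker: the mean of that speaker's support embeddings.
    prototypes = []
    for c in classes:
        members = [x for x, y in zip(support, support_labels) if y == c]
        prototypes.append(
            [sum(m[d] for m in members) / len(members) for d in range(dim)]
        )

    total_loss = 0.0
    correct = 0
    for x, y in zip(query, query_labels):
        # Logit for each class is the negative squared distance to its prototype.
        logits = [-sum((a - b) ** 2 for a, b in zip(x, p)) for p in prototypes]
        # Numerically stable log-sum-exp for the softmax normaliser.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
        target = classes.index(y)
        # Negative log-probability of the true speaker.
        total_loss += log_norm - logits[target]
        correct += int(logits.index(peak) == target)

    return total_loss / len(query), correct / len(query)
```

In training, the support and query embeddings would come from the network for each sampled episode, and the loss would be backpropagated through an automatic differentiation framework; this pure-Python version only shows the arithmetic.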

Authors (6)
  1. Yanxiong Li (18 papers)
  2. Wucheng Wang (2 papers)
  3. Hao Chen (1007 papers)
  4. Wenchang Cao (9 papers)
  5. Wei Li (1123 papers)
  6. Qianhua He (10 papers)
Citations (4)
