Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention (2204.11180v1)
Abstract: Although few-shot learning has attracted much attention in image and audio classification, little effort has been devoted to few-shot speaker identification. In few-shot learning, overfitting is a serious problem, mainly caused by the mismatch between training and testing conditions. In this paper, we propose a few-shot speaker identification method that alleviates the overfitting problem. In the proposed method, a depthwise separable convolutional network with channel attention is trained with a prototypical loss function. Experimental datasets are extracted from three public speech corpora: Aishell-2, VoxCeleb1 and TORGO. Experimental results show that the proposed method outperforms state-of-the-art methods for few-shot speaker identification in terms of both accuracy and F-score.
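To illustrate the prototypical classification rule mentioned in the abstract, here is a minimal numpy sketch. It assumes the network maps each utterance to a fixed-dimensional speaker embedding (simulated here with synthetic vectors); each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. The episode sizes and embedding dimension are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def prototypes(support, labels, n_classes):
    # Class prototype c_k = mean of support embeddings with label k.
    return np.stack([support[labels == k].mean(axis=0) for k in range(n_classes)])

def proto_logits(queries, protos):
    # Negative squared Euclidean distance to each prototype serves as the logit;
    # a softmax over these gives the prototypical-loss class probabilities.
    d2 = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return -d2

# Toy 3-way 5-shot episode with 8-dim "speaker embeddings" (stand-ins for
# the outputs of the depthwise separable convolutional network).
n_classes, n_shot, dim = 3, 5, 8
centers = rng.normal(size=(n_classes, dim)) * 3.0
support = np.concatenate([centers[k] + 0.1 * rng.normal(size=(n_shot, dim))
                          for k in range(n_classes)])
support_labels = np.repeat(np.arange(n_classes), n_shot)
queries = centers + 0.1 * rng.normal(size=(n_classes, dim))  # one query per class

protos = prototypes(support, support_labels, n_classes)
pred = proto_logits(queries, protos).argmax(axis=1)
print(pred)  # each query lands on its own class prototype
```

At test time, few-shot identification reduces to this nearest-prototype rule: enroll a new speaker with a handful of utterances, average their embeddings, and compare queries against the stored prototypes.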
- Yanxiong Li (18 papers)
- Wucheng Wang (2 papers)
- Hao Chen (1007 papers)
- Wenchang Cao (9 papers)
- Wei Li (1123 papers)
- Qianhua He (10 papers)