Max-margin Metric Learning for Speaker Recognition

Published 20 Oct 2015 in cs.SD and cs.LG | (1510.05940v2)

Abstract: Probabilistic linear discriminant analysis (PLDA) is a popular normalization approach for the i-vector model, and has delivered state-of-the-art performance in speaker recognition. A potential problem of the PLDA model, however, is that it essentially assumes Gaussian distributions over speaker vectors, an assumption that does not always hold in practice. Additionally, its objective function is not directly related to the goal of the task, i.e., discriminating true speakers from imposters. In this paper, we propose a max-margin metric learning approach to address both problems. It learns a linear transform under the criterion that the margin between target and imposter trials is maximized. Experiments conducted on the SRE08 core test show that, compared to PLDA, the new approach obtains comparable or even better performance, even though the scoring is simply a cosine computation.
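To make the idea in the abstract concrete, the following is a minimal sketch of cosine scoring under a learned linear transform, together with a hinge-style margin loss over target and imposter trials. The abstract does not give the paper's exact objective, margin value, or optimizer, so the pairwise hinge form, the function names `cosine_score` and `margin_loss`, and the parameter `delta` are all assumptions made for illustration.

```python
import numpy as np


def cosine_score(M, x, y):
    """Cosine similarity between two i-vectors after applying the linear transform M."""
    u, v = M @ x, M @ y
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def margin_loss(M, targets, imposters, delta=0.5):
    """Mean hinge loss over all (target trial, imposter trial) pairs.

    Each trial is a pair (enrollment vector, test vector).
    `delta` is a hypothetical margin; the abstract does not state the value used.
    """
    t = np.array([cosine_score(M, a, b) for a, b in targets])
    i = np.array([cosine_score(M, a, b) for a, b in imposters])
    # Penalize every pair in which the imposter score comes within
    # `delta` of the target score (assumed pairwise hinge form).
    gaps = delta - (t[:, None] - i[None, :])
    return float(np.maximum(0.0, gaps).mean())
```

In a sketch like this, `M` would be fitted by minimizing `margin_loss` over training trials with a gradient-based optimizer, and test-time scoring would reduce to a single call to `cosine_score`.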
