Exploring a Unified Attention-Based Pooling Framework for Speaker Verification (1808.07120v1)

Published 21 Aug 2018 in cs.SD and eess.AS

Abstract: The pooling layer is an essential component of neural-network-based speaker verification. Most current networks in speaker verification use average pooling to derive utterance-level speaker representations. Average pooling treats every frame as equally important, which is suboptimal since speaker-discriminant power differs across speech segments. In this paper, we present a unified attention-based pooling framework and combine it with multi-head attention. Experiments on the Fisher and NIST SRE 2010 datasets show that involving outputs from lower layers in computing the attention weights outperforms average pooling and achieves better results than the vanilla attention method. Multi-head attention further improves performance.
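The core idea contrasted in the abstract, replacing a uniform average over frames with a learned weighted average, can be illustrated with a minimal attentive-pooling sketch in NumPy. The projection shapes and the `tanh` scoring function here are illustrative assumptions in the style of common attentive pooling, not the paper's exact formulation:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the frame axis.
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def attention_pooling(H, W, v):
    """Attentive pooling over frame-level features.

    H: (T, D) frame-level hidden features for one utterance
    W: (A, D) projection matrix; v: (A,) scoring vector
    Returns a (D,) utterance-level embedding: a weighted
    average whose weights reflect estimated per-frame
    importance, unlike uniform average pooling.
    """
    scores = v @ np.tanh(W @ H.T)   # (T,) per-frame relevance scores
    alpha = softmax(scores)         # attention weights, sum to 1
    return alpha @ H                # (D,) weighted sum of frames

# Toy example with random features and parameters.
rng = np.random.default_rng(0)
T, D, A = 50, 16, 8
H = rng.standard_normal((T, D))
W = rng.standard_normal((A, D))
v = rng.standard_normal(A)
e = attention_pooling(H, W, v)
print(e.shape)  # (16,)
```

Average pooling is the special case where all weights equal 1/T; the paper's framework additionally lets lower-layer outputs feed the score computation, and multi-head attention runs several such weightings in parallel before concatenating the results.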

Authors (4)
  1. Yi Liu (543 papers)
  2. Liang He (202 papers)
  3. Weiwei Liu (51 papers)
  4. Jia Liu (369 papers)
Citations (8)
