pooling

(s3prl.nn.pooling)

Common pooling methods

Authors:
  • Leo 2022

  • Haibin Wu 2022

MeanPooling

class s3prl.nn.pooling.MeanPooling(input_size: int)

Bases: Module

Temporal average pooling: computes the mean of the frame-level features over time.

property input_size: int
property output_size: int
forward(xs: Tensor, xs_len: LongTensor)
Parameters:
  • xs (torch.Tensor) – Input tensor (#batch, frames, input_size).

  • xs_len (torch.LongTensor) – Lengths (in frames) of each sample in the batch.

Returns:

Output tensor (#batch, input_size)

Return type:

torch.Tensor

call_super_init: bool = False
dump_patches: bool = False
training: bool
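
The shapes documented for MeanPooling.forward can be illustrated with a small standalone sketch. This is not the s3prl implementation: the function name masked_mean_pool is hypothetical, and the sketch assumes that frames at or beyond xs_len[i] are padding and should be excluded from the average.

```python
import torch


def masked_mean_pool(xs: torch.Tensor, xs_len: torch.Tensor) -> torch.Tensor:
    """Average the first xs_len[i] frames of each sample (assumption: the rest is padding)."""
    frames = xs.size(1)
    # mask[b, t] is True where frame t is a real (non-padded) frame of sample b
    mask = torch.arange(frames, device=xs.device)[None, :] < xs_len[:, None]
    mask = mask.unsqueeze(-1).to(xs.dtype)                    # (#batch, frames, 1)
    summed = (xs * mask).sum(dim=1)                           # (#batch, input_size)
    count = xs_len.clamp(min=1).unsqueeze(-1).to(xs.dtype)    # (#batch, 1)
    return summed / count


# Two samples, three frames, input_size = 2; the last frame of sample 0 is padding.
xs = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
                   [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]])
xs_len = torch.tensor([2, 3])
pooled = masked_mean_pool(xs, xs_len)   # shape (#batch, input_size)
```

The padded frame of the first sample does not affect its pooled vector, which is why the lengths are passed alongside the features.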

TemporalAveragePooling

s3prl.nn.pooling.TemporalAveragePooling

alias of MeanPooling

TemporalStatisticsPooling

class s3prl.nn.pooling.TemporalStatisticsPooling(input_size: int)

Bases: Module

Paper: X-vectors: Robust DNN Embeddings for Speaker Recognition
Link: http://www.danielpovey.com/files/2018_icassp_xvectors.pdf

property input_size: int
property output_size: int
forward(xs, xs_len)

Computes temporal statistics pooling.

Parameters:
  • xs (torch.Tensor) – Input tensor (#batch, frames, input_size).

  • xs_len (torch.LongTensor) – Lengths (in frames) of each sample in the batch.

Returns:

Output tensor (#batch, output_size)

Return type:

torch.Tensor

call_super_init: bool = False
dump_patches: bool = False
training: bool
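
As a rough illustration of statistics pooling in the sense of the x-vector paper cited above, the sketch below concatenates the per-sample mean and standard deviation over time. It is not the s3prl code: masked_stats_pool is a hypothetical name, the output width of 2 * input_size is an assumption based on the paper, and the sketch uses the population standard deviation (dividing by the number of valid frames), which may differ from what the library computes.

```python
import torch


def masked_stats_pool(xs: torch.Tensor, xs_len: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Concatenate mean and std over the valid frames of each sample."""
    frames = xs.size(1)
    # Assumption: frames at index >= xs_len[b] are padding and must be ignored.
    mask = torch.arange(frames, device=xs.device)[None, :] < xs_len[:, None]
    mask = mask.unsqueeze(-1).to(xs.dtype)                    # (#batch, frames, 1)
    count = xs_len.clamp(min=1).unsqueeze(-1).to(xs.dtype)    # (#batch, 1)
    mean = (xs * mask).sum(dim=1) / count                     # (#batch, input_size)
    # Population variance over valid frames (assumption, see the lead-in).
    var = (((xs - mean.unsqueeze(1)) ** 2) * mask).sum(dim=1) / count
    std = torch.sqrt(var + eps)                               # eps keeps sqrt stable at 0
    return torch.cat([mean, std], dim=-1)                     # (#batch, 2 * input_size)


xs = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
                   [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]])
xs_len = torch.tensor([2, 3])
stats = masked_stats_pool(xs, xs_len)
```

For the actual output width, the module's output_size property remains the reference.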

SelfAttentivePooling

class s3prl.nn.pooling.SelfAttentivePooling(input_size: int)

Bases: Module

Paper: Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification
Link: https://danielpovey.com/files/2018_interspeech_xvector_attention.pdf

property input_size: int
property output_size: int
forward(xs, xs_len)

Computes self-attentive pooling.

Parameters:
  • xs (torch.Tensor) – Input tensor (#batch, frames, input_size).

  • xs_len (torch.LongTensor) – Lengths (in frames) of each sample in the batch.

Returns:

Output tensor (#batch, input_size)

Return type:

torch.Tensor

call_super_init: bool = False
dump_patches: bool = False
training: bool
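
The idea behind self-attentive pooling can be sketched as a learned, per-frame weighting followed by a weighted sum over time. The module below is a generic illustration, not the s3prl implementation: the class name SelfAttentivePoolingSketch, the layer names proj and score, and the tanh scoring function are all assumptions. It also assumes every sample has at least one valid frame.

```python
import torch
import torch.nn as nn


class SelfAttentivePoolingSketch(nn.Module):
    """Hypothetical sketch: softmax attention over valid frames, then a weighted mean."""

    def __init__(self, input_size: int):
        super().__init__()
        self.proj = nn.Linear(input_size, input_size)      # frame-wise hidden projection
        self.score = nn.Linear(input_size, 1, bias=False)  # one scalar score per frame

    def attention_weights(self, xs: torch.Tensor, xs_len: torch.Tensor) -> torch.Tensor:
        frames = xs.size(1)
        valid = torch.arange(frames, device=xs.device)[None, :] < xs_len[:, None]
        logits = self.score(torch.tanh(self.proj(xs))).squeeze(-1)  # (#batch, frames)
        # Padded frames get -inf so that softmax assigns them zero weight.
        logits = logits.masked_fill(~valid, float("-inf"))
        return torch.softmax(logits, dim=-1)

    def forward(self, xs: torch.Tensor, xs_len: torch.Tensor) -> torch.Tensor:
        weights = self.attention_weights(xs, xs_len)        # (#batch, frames)
        return (weights.unsqueeze(-1) * xs).sum(dim=1)      # (#batch, input_size)


torch.manual_seed(0)
pool = SelfAttentivePoolingSketch(input_size=4)
xs = torch.randn(2, 5, 4)
xs_len = torch.tensor([3, 5])
weights = pool.attention_weights(xs, xs_len)
out = pool(xs, xs_len)
```

Masking the logits before the softmax, rather than zeroing the weights afterwards, keeps the weights of the valid frames summing to one.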

AttentiveStatisticsPooling

class s3prl.nn.pooling.AttentiveStatisticsPooling(input_size: int)

Bases: Module

Paper: Attentive Statistics Pooling for Deep Speaker Embedding
Link: https://arxiv.org/pdf/1803.10963.pdf

property input_size: int
property output_size: int
forward(xs, xs_len)

Computes attentive statistics pooling.

Parameters:
  • xs (torch.Tensor) – Input tensor (#batch, frames, input_size).

  • xs_len (torch.LongTensor) – Lengths (in frames) of each sample in the batch.

Returns:

Output tensor (#batch, output_size)

Return type:

torch.Tensor

call_super_init: bool = False
dump_patches: bool = False
training: bool
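
Attentive statistics pooling, as described in the paper cited above, replaces the plain mean and standard deviation with attention-weighted ones. The sketch below is a generic illustration under stated assumptions, not the s3prl implementation: the class name AttentiveStatsPoolingSketch, the layers proj and score, and the tanh scoring function are hypothetical. Following the paper, it concatenates the weighted mean and weighted standard deviation, so its output has 2 * input_size features; the module's output_size property is the reference for the real width.

```python
import torch
import torch.nn as nn


class AttentiveStatsPoolingSketch(nn.Module):
    """Hypothetical sketch: attention-weighted mean and std over valid frames."""

    def __init__(self, input_size: int, eps: float = 1e-8):
        super().__init__()
        self.proj = nn.Linear(input_size, input_size)      # frame-wise hidden projection
        self.score = nn.Linear(input_size, 1, bias=False)  # one scalar score per frame
        self.eps = eps

    def forward(self, xs: torch.Tensor, xs_len: torch.Tensor) -> torch.Tensor:
        frames = xs.size(1)
        valid = torch.arange(frames, device=xs.device)[None, :] < xs_len[:, None]
        logits = self.score(torch.tanh(self.proj(xs))).squeeze(-1)  # (#batch, frames)
        # Padded frames get -inf so that softmax assigns them zero weight.
        logits = logits.masked_fill(~valid, float("-inf"))
        weights = torch.softmax(logits, dim=-1).unsqueeze(-1)       # (#batch, frames, 1)
        mean = (weights * xs).sum(dim=1)                            # weighted first moment
        second_moment = (weights * xs ** 2).sum(dim=1)              # weighted second moment
        # Clamp guards against tiny negative values from floating-point rounding.
        var = (second_moment - mean ** 2).clamp(min=0.0)
        std = torch.sqrt(var + self.eps)
        return torch.cat([mean, std], dim=-1)                       # (#batch, 2 * input_size)


torch.manual_seed(0)
asp = AttentiveStatsPoolingSketch(input_size=4)
xs = torch.randn(2, 5, 4)
xs_len = torch.tensor([3, 5])
asp_out = asp(xs, xs_len)
```

The sketch assumes every sample has at least one valid frame; a length of zero would make all logits -inf and the softmax undefined.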