hear

(s3prl.nn.hear)

The probing model that follows the Hear Benchmark.

Authors:

Hear Team 2021
Leo 2022

HearFullyConnectedPrediction

class s3prl.nn.hear.HearFullyConnectedPrediction(input_size: int, output_size: int, hidden_size: int = 1024, hidden_layers: int = 2, norm_after_activation: bool = False, dropout: float = 0.1, initialization: str = 'xavier_uniform_', hidden_norm: str = 'BatchNorm1d', pooling_type: str = None, pooling_conf: dict = None)

Bases: Module

The specific prediction head used in the Hear Benchmark. Modified from: https://github.com/hearbenchmark/hear-eval-kit/blob/855964977238e89dfc76394aa11c37010edb6f20/heareval/predictions/task_predictions.py#L142

Parameters:

input_size (int) – size of the last dimension of the input features
output_size (int) – size of the last dimension of the output predictions
hidden_size (int) – hidden size across all layers. Default: 1024
hidden_layers (int) – number of hidden layers, all in hidden_size. Default: 2
norm_after_activation (bool) – whether to norm after activation. Default: False
dropout (float) – dropout ratio. Default: 0.1
initialization (str) – initialization method name available in torch.nn.init. Default: xavier_uniform_
hidden_norm (str) – normalization method name available in torch.nn. Default: BatchNorm1d
pooling_type (str) – the pooling class name in s3prl.nn.pooling, e.g. MeanPooling. Default: None (no pooling)
pooling_conf (dict) – the arguments for initializing the pooling class. Default: None, treated as an empty dict
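The string-valued parameters above name objects in torch.nn.init and torch.nn. The following is a minimal sketch of how such names could be resolved and assembled into a hidden stack. It is not the s3prl implementation: build_hidden_stack is a hypothetical helper, and the ReLU activation and the exact layer ordering chosen for norm_after_activation are assumptions made for illustration.

```python
import torch
from torch import nn


def build_hidden_stack(
    input_size: int,
    hidden_size: int = 1024,
    hidden_layers: int = 2,
    norm_after_activation: bool = False,
    dropout: float = 0.1,
    initialization: str = "xavier_uniform_",
    hidden_norm: str = "BatchNorm1d",
) -> nn.Sequential:
    """Hypothetical helper: build an MLP hidden stack from string names."""
    init_fn = getattr(nn.init, initialization)  # e.g. nn.init.xavier_uniform_
    norm_cls = getattr(nn, hidden_norm)         # e.g. nn.BatchNorm1d

    layers = []
    in_size = input_size
    for _ in range(hidden_layers):
        linear = nn.Linear(in_size, hidden_size)
        init_fn(linear.weight)  # initialize the weight matrix in place
        # Assumption: ReLU activation, and this particular ordering.
        if norm_after_activation:
            layers += [linear, nn.ReLU(), norm_cls(hidden_size), nn.Dropout(dropout)]
        else:
            layers += [linear, norm_cls(hidden_size), nn.ReLU(), nn.Dropout(dropout)]
        in_size = hidden_size
    return nn.Sequential(*layers)


torch.manual_seed(0)
stack = build_hidden_stack(input_size=16, hidden_size=32, hidden_layers=2)
out = stack(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 32])
```

Passing names rather than classes keeps the head configurable from plain config files. The trade-off is that a misspelled name only fails when getattr runs, at construction time.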

property input_size: int

property output_size: int

forward(x, x_len) → Tensor

Parameters:

x (torch.FloatTensor) – (batch_size, seq_len, input_size)
x_len (torch.LongTensor) – (batch_size, )

Returns:

y (torch.FloatTensor)
y_len (torch.LongTensor)

If pooling_type is None, y is (batch_size, seq_len, output_size) and y_len is (batch_size, ). Otherwise, y is (batch_size, output_size) and y_len is (batch_size, ), filled with 1s.

Return type:

tuple
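The two return shapes can be illustrated with a small PyTorch sketch. It does not call s3prl: masked_mean_pool and head_forward are hypothetical helpers, a single nn.Linear stands in for the prediction head, and passing x_len through unchanged in the no-pooling case is an assumption. The pooling shown is a length-aware mean, used here only as one plausible pooling choice.

```python
import torch
from torch import nn


def masked_mean_pool(x: torch.Tensor, x_len: torch.Tensor) -> torch.Tensor:
    """Hypothetical pooling: average only the valid frames of each sequence."""
    steps = torch.arange(x.size(1), device=x.device)
    # mask[b, t, 0] is 1.0 where t < x_len[b], else 0.0
    mask = (steps.unsqueeze(0) < x_len.unsqueeze(1)).unsqueeze(-1).to(x.dtype)
    denom = x_len.clamp(min=1).unsqueeze(-1).to(x.dtype)
    return (x * mask).sum(dim=1) / denom


def head_forward(head: nn.Module, x: torch.Tensor, x_len: torch.Tensor, pool: bool):
    """Hypothetical forward pass showing both shape contracts."""
    if pool:
        # Utterance level: (batch_size, output_size), lengths are all 1s.
        pooled = masked_mean_pool(x, x_len)
        return head(pooled), x_len.new_ones(x.size(0))
    # Frame level: flatten time into the batch so per-feature layers
    # (such as BatchNorm1d) see 2-D input, then restore the time axis.
    batch_size, seq_len, _ = x.shape
    y = head(x.reshape(batch_size * seq_len, -1))
    # Assumption: lengths pass through unchanged when no pooling is used.
    return y.reshape(batch_size, seq_len, -1), x_len


torch.manual_seed(0)
head = nn.Linear(8, 3)         # stand-in for the real prediction head
x = torch.randn(2, 5, 8)       # (batch_size, seq_len, input_size)
x_len = torch.tensor([5, 3])   # second sequence has 2 padded frames

y_frames, len_frames = head_forward(head, x, x_len, pool=False)
y_utt, len_utt = head_forward(head, x, x_len, pool=True)
print(y_frames.shape, len_frames.tolist())  # torch.Size([2, 5, 3]) [5, 3]
print(y_utt.shape, len_utt.tolist())        # torch.Size([2, 3]) [1, 1]
```

Returning a length tensor in both cases lets downstream code handle frame-level and utterance-level outputs through the same (y, y_len) interface.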