S3PRL#
S3PRL is a toolkit for self-supervised learning (SSL) in speech processing. Its full name is Self-Supervised Speech Pre-training and Representation Learning. It supports the following three major features:
Pre-training
You can train the following models from scratch:
Mockingjay, Audio ALBERT, TERA, APC, VQ-APC, NPC, and DistilHuBERT
Pre-trained models (Upstream) collection
Easily load most existing upstream models with pretrained weights through a unified I/O interface.
Pretrained models are registered through torch.hub, so you can plug them into your own project with a single line of code, without depending on this toolkit's coding style.
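The one-line torch.hub workflow described above can be sketched as follows. This is a minimal sketch, not an exhaustive reference: it assumes the `s3prl/s3prl` hub repository and the `hubert` upstream name (one of the registered models listed below), and the dict-with-`hidden_states` output convention of the unified interface.

```python
import torch

# Hub repository and upstream name, as assumed from the docs above.
REPO = "s3prl/s3prl"


def load_upstream(name: str = "hubert"):
    """Load a pretrained upstream model via torch.hub.

    Downloads the repo and weights on first use, so it is guarded
    under __main__ below rather than run at import time.
    """
    return torch.hub.load(REPO, name)


if __name__ == "__main__":
    model = load_upstream("hubert")
    # Unified I/O interface: a list of waveform tensors in,
    # a dict of layer-wise hidden states out.
    wavs = [torch.randn(16000)]  # one second of 16 kHz audio
    hidden_states = model(wavs)["hidden_states"]
```

Because the model is fetched through torch.hub, no part of this snippet depends on cloning or installing the S3PRL repository itself.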
Downstream Evaluation
Evaluate upstream models on a wide range of downstream tasks
The official implementation of the SUPERB Benchmark
Getting Started#
- Install S3PRL
- S3PRL Upstream Collection
- SSL Method
- Mockingjay
- TERA
- Audio ALBERT
- APC
- VQ-APC
- NPC
- PASE+
- Modified CPC
- DeCoAR
- DeCoAR 2.0
- wav2vec
- vq-wav2vec
- Discrete BERT
- wav2vec 2.0
- XLS-R
- HuBERT
- ESPnetHuBERT
- WavLabLM
- Multiresolution HuBERT (MR-HuBERT)
- DistilHuBERT
- HuBERT-MGR
- Unispeech-SAT
- WavLM
- data2vec
- AST
- SSAST
- MAE-AST
- Byol-A
- Byol-S
- VGGish
- PaSST
- Use Problem module to run customizable recipes
How to Contribute#
API Documentation#
- s3prl.nn: Common models and losses in pure torch.nn.Module, with only a torch dependency
- s3prl.problem: Pre-defined Python recipes with customizable methods
- s3prl.task: Defines how a model is trained and evaluated at each step of the train/valid/test loop
- s3prl.dataio: Handles data-related sub-tasks
- s3prl.metric: Evaluation metrics
- s3prl.util: Handy tools