metric#

(s3prl.metric)

Evaluation metrics

s3prl.metric.common – Commonly used metrics

s3prl.metric.diarization – Metrics for diarization

s3prl.metric.slot_filling – Metrics for the slot filling SLU task

accuracy#

s3prl.metric.accuracy(xs, ys, item_same_fn=None)[source]#
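
No description is provided for this function. A minimal usage sketch, assuming accuracy returns the fraction of paired items in xs and ys that match, with item_same_fn as an optional custom comparison:

    from s3prl.metric import accuracy

    preds = ["yes", "no", "yes", "yes"]
    labels = ["yes", "no", "no", "yes"]

    # Fraction of positions where prediction and label agree (3 of 4 here).
    print(accuracy(preds, labels))

    # Presumably a custom comparison can be supplied via item_same_fn,
    # e.g. case-insensitive string matching:
    print(accuracy(preds, labels, item_same_fn=lambda x, y: x.lower() == y.lower()))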

ter#

s3prl.metric.ter(hyps: List[Union[str, List[str]]], refs: List[Union[str, List[str]]]) → float[source]#

Token error rate calculator.

Parameters:
  • hyps (List[Union[str, List[str]]]) – List of hypotheses.

  • refs (List[Union[str, List[str]]]) – List of references.

Returns:

Averaged token error rate over all utterances.

Return type:

float
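
A minimal sketch of a ter call. Each hypothesis/reference pair is compared as a token sequence; how plain strings are tokenized should be checked in s3prl.metric.common, so pre-tokenized lists are used here:

    from s3prl.metric import ter

    hyps = [["the", "cat", "sat"], ["hello", "world"]]
    refs = [["the", "cat", "sat"], ["hello", "there"]]

    # Edit distance between each token sequence pair, normalized over the references.
    print(ter(hyps, refs))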

wer#

s3prl.metric.wer(hyps: List[str], refs: List[str]) → float[source]#

Word error rate calculator.

Parameters:
  • hyps (List[str]) – List of hypotheses.

  • refs (List[str]) – List of references.

Returns:

Averaged word error rate over all utterances.

Return type:

float
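
A small usage sketch, assuming each string is split on whitespace into words before the edit distance is computed:

    from s3prl.metric import wer

    hyps = ["the cat sat on the mat", "hello world"]
    refs = ["the cat sat on a mat", "hello world"]

    # One substituted word in the first utterance; the second matches exactly.
    print(wer(hyps, refs))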

per#

s3prl.metric.per(hyps: List[str], refs: List[str]) → float[source]#

Phoneme error rate calculator.

Parameters:
  • hyps (List[str]) – List of hypotheses.

  • refs (List[str]) – List of references.

Returns:

Averaged phoneme error rate over all utterances.

Return type:

float
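
per has the same interface as wer but operates on phoneme transcriptions; a sketch assuming phonemes are space-separated within each string:

    from s3prl.metric import per

    hyps = ["HH AH L OW", "W ER L D"]
    refs = ["HH EH L OW", "W ER L D"]

    # One substituted phoneme ("AH" vs "EH") in the first utterance.
    print(per(hyps, refs))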

cer#

s3prl.metric.cer(hyps: List[str], refs: List[str]) → float[source]#

Character error rate calculator.

Parameters:
  • hyps (List[str]) – List of hypotheses.

  • refs (List[str]) – List of references.

Returns:

Averaged character error rate over all utterances.

Return type:

float
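
A sketch assuming each string is scored character by character, spaces included:

    from s3prl.metric import cer

    hyps = ["hello world"]
    refs = ["hello word"]

    # The hypothesis inserts one extra character ("l") relative to the reference.
    print(cer(hyps, refs))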

compute_eer#

s3prl.metric.compute_eer(labels: List[int], scores: List[float])[source]#

Compute equal error rate.

Parameters:
  • scores (List[float]) – Detection scores, one per trial.

  • labels (List[int]) – Ground-truth labels: 1 for a target trial, 0 for a non-target trial.

Returns:

The equal error rate, together with the threshold at which a target trial is accepted.

Return type:

eer (float), threshold (float)
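
A toy speaker-verification sketch. labels mark target (1) versus non-target (0) trials and scores are the corresponding detection scores; per the Returns section above, both the EER and the acceptance threshold are returned, unpacked here as a pair:

    from s3prl.metric import compute_eer

    labels = [1, 1, 1, 0, 0, 0]              # 1 = target trial, 0 = non-target trial
    scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]  # e.g. cosine similarities from a verification model

    eer, threshold = compute_eer(labels, scores)
    print(f"EER: {eer:.3f} at threshold {threshold:.3f}")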

compute_minDCF#

s3prl.metric.compute_minDCF(labels: List[int], scores: List[float], p_target: float = 0.01, c_miss: int = 1, c_fa: int = 1)[source]#

Compute the minimum of the detection cost function (minDCF). The comments in the source refer to equations in Section 3 of the NIST 2016 Speaker Recognition Evaluation Plan.

Parameters:
  • scores (List[float]) – Detection scores, one per trial.

  • labels (List[int]) – Ground-truth labels: 1 for a target trial, 0 for a non-target trial.

  • p_target (float) – The prior probability of the target (positive) class.

  • c_miss (int) – The cost of a missed detection.

  • c_fa (int) – The cost of a false alarm.

Returns:

The minimum of the detection cost function, together with the threshold at which it is reached.

Return type:

min_dcf (float), min_c_det_threshold (float)
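
The same toy trial list can be scored with compute_minDCF; per the Returns section above, the minimum detection cost and the threshold at which it is reached are returned, unpacked here as a pair:

    from s3prl.metric import compute_minDCF

    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

    # Defaults written out for clarity: p_target=0.01, c_miss=1, c_fa=1.
    min_dcf, min_threshold = compute_minDCF(labels, scores, p_target=0.01, c_miss=1, c_fa=1)
    print(f"minDCF: {min_dcf:.3f} at threshold {min_threshold:.3f}")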

calc_diarization_error#

s3prl.metric.calc_diarization_error(pred, label, length)[source]#
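
No description is provided for this function. A heavily hedged sketch, assuming (as in EEND-style diarization setups) that pred and label are frame-level speaker-activity tensors of shape (batch, time, num_speakers) and length gives the number of valid frames per utterance; the shapes, dtypes, and return structure are assumptions and should be verified against s3prl.metric.diarization:

    import torch

    from s3prl.metric import calc_diarization_error

    batch, time, num_speakers = 2, 100, 2
    pred = torch.randn(batch, time, num_speakers)             # assumed: frame-level speaker activity predictions
    label = torch.randint(0, 2, (batch, time, num_speakers))  # assumed: binary speaker activity labels
    length = torch.tensor([100, 80])                          # assumed: valid frames per utterance

    stats = calc_diarization_error(pred, label, length)
    print(stats)  # the exact return structure is not documented on this page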

slot_edit_f1#

s3prl.metric.slot_edit_f1(hypothesis: List[str], groundtruth: List[str], loop_over_all_slot: bool, **kwargs) → float[source]#

slot_value_cer#

s3prl.metric.slot_value_cer(hypothesis: List[str], groundtruth: List[str], **kwargs) → float[source]#

slot_value_wer#

s3prl.metric.slot_value_wer(hypothesis: List[str], groundtruth: List[str], **kwargs) → float[source]#

slot_type_f1#

s3prl.metric.slot_type_f1(hypothesis: List[str], groundtruth: List[str], **kwargs) → float[source]#
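
All four slot-filling metrics share the same call pattern: a list of hypothesis transcripts and a list of ground-truth transcripts in which slot labels are embedded in the text (the exact encoding follows the SNIPS slot-filling recipe and is not documented on this page). The snippet below only illustrates the call pattern; the placeholder lists must be filled with transcripts in that recipe-specific format:

    from s3prl.metric import slot_edit_f1, slot_type_f1, slot_value_cer, slot_value_wer

    hypothesis = []   # decoded transcripts with embedded slot labels (recipe-specific format)
    groundtruth = []  # reference transcripts in the same format

    print(slot_type_f1(hypothesis, groundtruth))
    print(slot_value_cer(hypothesis, groundtruth))
    print(slot_value_wer(hypothesis, groundtruth))
    print(slot_edit_f1(hypothesis, groundtruth, loop_over_all_slot=True))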