mbrs.decoders.pruning_mbr module

class mbrs.decoders.pruning_mbr.DecoderPruningMBR(cfg: mbrs.decoders.pruning_mbr.DecoderPruningMBR.Config, metric: mbrs.metrics.base.Metric, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)

Bases: DecoderMBR

Pruning MBR decoder class.

References

J. Cheng and A. Vlachos, 2023, “Faster Minimum Bayes Risk Decoding with Confidence-based Pruning”. https://aclanthology.org/2023.emnlp-main.767/

class Config(alpha: float = 0.99, sampling_scheduler: list[int] = <factory>, num_bootstrap_samples: int = 500, seed: int = 0)

Bases: Config

Configuration for the decoder.

alpha (float): Confidence threshold used to prune hypotheses.
sampling_scheduler (list[int]): Sample size schedule. At step t, the number of samples used is the t-th entry of this list.
num_bootstrap_samples (int): Number of bootstrap samples.
seed (int): Random seed for bootstrap sampling.
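
To make the roles of these four fields concrete, the following is a minimal pure-Python sketch of confidence-based pruning, loosely following the idea in Cheng and Vlachos (2023). It is not the library's implementation: the function name `pruning_mbr_sketch`, the toy `jaccard` utility, the illustrative schedule `(2, 4, 8)`, and the exact pruning rule (keep a hypothesis only if it matches or beats the current best in more than `1 - alpha` of the bootstrap resamples) are all assumptions made for illustration.

```python
import random
from typing import Callable, Sequence


def jaccard(hyp: str, ref: str) -> float:
    """Toy utility (an assumption for this sketch): token-set Jaccard overlap."""
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / len(h | r) if h | r else 1.0


def pruning_mbr_sketch(
    hypotheses: Sequence[str],
    references: Sequence[str],
    utility: Callable[[str, str], float] = jaccard,
    alpha: float = 0.99,
    sampling_scheduler: Sequence[int] = (2, 4, 8),  # illustrative, not the library default
    num_bootstrap_samples: int = 500,
    seed: int = 0,
    nbest: int = 1,
) -> tuple[list[float], list[int]]:
    """Hypothetical sketch: returns (top-k scores, top-k indices)."""
    if not hypotheses or not references or not sampling_scheduler:
        raise ValueError("hypotheses, references and sampling_scheduler must be non-empty")
    if min(sampling_scheduler) < 1:
        raise ValueError("every scheduled sample size must be at least 1")

    rng = random.Random(seed)  # seed controls only the bootstrap resampling
    alive = list(range(len(hypotheses)))  # indices of unpruned hypotheses
    util: dict[int, list[float]] = {i: [] for i in alive}  # cached utility rows
    num_refs = 0

    for step_size in sampling_scheduler:
        # Grow the reference sample to the size scheduled for this step.
        num_refs = max(num_refs, min(step_size, len(references)))
        for i in alive:
            for j in range(len(util[i]), num_refs):
                util[i].append(utility(hypotheses[i], references[j]))

        # Current best hypothesis by mean utility over the references seen so far.
        best = max(alive, key=lambda i: sum(util[i]) / num_refs)

        # Bootstrap: resample reference columns with replacement and count how
        # often each hypothesis scores at least as well as the current best.
        wins = {i: 0 for i in alive}
        for _ in range(num_bootstrap_samples):
            cols = [rng.randrange(num_refs) for _ in range(num_refs)]
            best_score = sum(util[best][c] for c in cols)
            for i in alive:
                if sum(util[i][c] for c in cols) >= best_score:
                    wins[i] += 1

        # Prune hypotheses whose estimated win rate is not above 1 - alpha.
        alive = [i for i in alive if wins[i] / num_bootstrap_samples > 1 - alpha]

    scores = {i: sum(util[i]) / num_refs for i in alive}
    ranked = sorted(alive, key=lambda i: scores[i], reverse=True)[:nbest]
    return [scores[i] for i in ranked], ranked
```

In this sketch a higher `alpha` prunes less aggressively, and pruned hypotheses are never scored against references added in later steps, which is where the savings come from.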

alpha: float = 0.99

num_bootstrap_samples: int = 500

sampling_scheduler: list[int]

seed: int = 0

cfg: Config

decode(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → Output

Select the n-best hypotheses based on the strategy.

Parameters:

hypotheses (list[str]) – Hypotheses.
references (list[str]) – References.
source (str, optional) – A source.
nbest (int) – Return the n-best hypotheses.
reference_lprobs (Tensor, optional) – Log-probabilities for each reference sample. The shape must be (len(references),). See https://arxiv.org/abs/2311.05263.

Returns:

The n-best hypotheses.

Return type:

DecoderMBR.Output
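
As a rough illustration of what reference_lprobs can be used for, the sketch below weights each reference's utility by a probability derived from its log-probability. The function name `weighted_expected_utility` and the softmax normalization over the reference samples are assumptions for illustration; the library's exact handling may differ.

```python
import math
from typing import Optional, Sequence


def weighted_expected_utility(
    utilities: Sequence[Sequence[float]],
    reference_lprobs: Optional[Sequence[float]] = None,
) -> list[float]:
    """Hypothetical helper: expected utility of each hypothesis.

    utilities[i][j] is the utility of hypothesis i against reference j.
    """
    num_refs = len(utilities[0])
    if reference_lprobs is None:
        # Without log-probabilities, every reference gets the same weight.
        weights = [1.0 / num_refs] * num_refs
    else:
        if len(reference_lprobs) != num_refs:
            raise ValueError("reference_lprobs must have shape (len(references),)")
        # Numerically stable softmax over the reference log-probabilities
        # (assumed normalization for this sketch).
        shift = max(reference_lprobs)
        exps = [math.exp(lp - shift) for lp in reference_lprobs]
        total = sum(exps)
        weights = [e / total for e in exps]
    return [sum(w * u for w, u in zip(weights, row)) for row in utilities]
```

With uniform weights this reduces to the plain mean utility over the references.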

decode_pruning(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → tuple[list[float], list[int]]

Select the n-best hypotheses using pruning MBR decoding.

Parameters:

hypotheses (list[str]) – Hypotheses.
references (list[str]) – References.
source (str, optional) – A source.
nbest (int) – Return the n-best hypotheses.
reference_lprobs (Tensor, optional) – Log-probabilities for each reference sample. The shape must be (len(references),). See https://arxiv.org/abs/2311.05263.

Returns:

list[float]: Top-k scores.
list[int]: Top-k indices.

Return type:

tuple[list[float], list[int]]
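
Because the result is a pair of parallel lists, callers usually need to map the indices back to the hypothesis strings. A small sketch of that step, where `pair_topk` is a hypothetical helper and not part of the library:

```python
from typing import Sequence


def pair_topk(
    hypotheses: Sequence[str],
    topk_scores: Sequence[float],
    topk_indices: Sequence[int],
) -> list[tuple[str, float]]:
    """Hypothetical helper: pair each selected hypothesis with its score."""
    return [(hypotheses[i], s) for s, i in zip(topk_scores, topk_indices)]
```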

metric: Metric
