mbrs.decoders package
Submodules
Module contents
- class mbrs.decoders.DecoderAggregateMBR(cfg: mbrs.decoders.base.DecoderBase.Config, metric: mbrs.metrics.base.MetricBase, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderMBR
MBR decoding with reference aggregation.
Time complexity: O(N)
Space complexity: O(N)
References
J. DeNero et al., 2009, “Fast Consensus Decoding over Translation Forests”. https://aclanthology.org/P09-1064/
J. Vamvas and R. Sennrich, 2024, “Linear-time Minimum Bayes Risk Decoding with Reference Aggregation”. https://arxiv.org/abs/2402.04251
- decode(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → Output
Select the n-best hypotheses based on the strategy.
- Parameters:
- Returns:
The n-best hypotheses.
- Return type:
DecoderAggregateMBR.Output
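The linear cost comes from collapsing all references into one aggregate representation and scoring each hypothesis against it once. The sketch below illustrates the idea with a toy utility based on unigram counts. It does not use the mbrs API; `aggregate_references`, `count_f1`, and `aggregate_mbr` are hypothetical names chosen for this example, and the averaged-count utility is an assumption.

```python
from collections import Counter


def aggregate_references(references: list[str]) -> dict[str, float]:
    """Average the unigram counts of all references into one pseudo-reference."""
    total = Counter()
    for ref in references:
        total.update(ref.split())
    return {tok: cnt / len(references) for tok, cnt in total.items()}


def count_f1(hyp: str, agg: dict[str, float]) -> float:
    """Toy utility (an assumption): F1 between hypothesis counts and averaged counts."""
    hyp_counts = Counter(hyp.split())
    overlap = sum(min(c, agg.get(tok, 0.0)) for tok, c in hyp_counts.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(agg.values())
    return 2 * precision * recall / (precision + recall)


def aggregate_mbr(hypotheses: list[str], references: list[str]):
    agg = aggregate_references(references)           # one pass over the references
    scores = [count_f1(h, agg) for h in hypotheses]  # one utility call per hypothesis
    best = max(range(len(hypotheses)), key=lambda i: scores[i])
    return best, scores


hyps = ["the cat sat", "a cat sat down", "the cat sat down"]
best, scores = aggregate_mbr(hyps, hyps)
print(best, hyps[best])  # 2 the cat sat down
```

With N hypotheses and N references this makes N utility calls instead of N^2, at the price of an approximation whose quality depends on how well the metric tolerates averaged references.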
- class mbrs.decoders.DecoderBase(cfg: mbrs.decoders.base.DecoderBase.Config, metric: mbrs.metrics.base.MetricBase, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: ABC
Decoder base class.
- class Output(idx: list[int], sentence: list[str], score: list[float])
Bases: object
idx (list[int]): Index numbers of the n-best hypotheses.
sentence (list[str]): Sentences of the n-best hypotheses.
score (list[float]): Scores of the n-best hypotheses.
- argbest(x: Tensor) → Tensor
Return the index of the best element.
- Parameters:
x (Tensor) – Input 1-D array.
- Returns:
A scalar tensor of the best index.
- Return type:
Tensor
- select(hypotheses: list[str], expected_scores: Tensor, nbest: int = 1, source: str | None = None, **kwargs) → Output
Select the final output list.
- Parameters:
- Returns:
Outputs.
- Return type:
Output
- topk(x: Tensor, k: int = 1) → tuple[list[float], list[int]]
Return the top-k best elements and corresponding indices.
- Parameters:
x (Tensor) – Input 1-D array.
k (int) – Return the top-k values and indices.
- Returns:
- tuple[list[float], list[int]]
list[float]: The top-k values.
list[int]: The top-k indices.
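The return shape of topk (values first, then indices) can be illustrated with a plain-Python stand-in. This is a sketch only: it takes a list rather than a Tensor, and the name `topk_sketch` and the `maximize` flag are assumptions introduced here to show that "best" depends on whether higher or lower scores are better.

```python
def topk_sketch(
    values: list[float], k: int = 1, maximize: bool = True
) -> tuple[list[float], list[int]]:
    """Return the k best values and their indices, best first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=maximize)[:k]
    return [values[i] for i in order], order


scores, indices = topk_sketch([0.2, 0.9, 0.5], k=2)
print(scores, indices)  # [0.9, 0.5] [1, 2]
```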
- class mbrs.decoders.DecoderCentroidMBR(cfg: mbrs.decoders.centroid_mbr.DecoderCentroidMBR.Config, metric: mbrs.metrics.base.MetricAggregatableCache, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderMBR
Centroid-Based MBR decoder class.
Time complexity: O(Nk)
Space complexity: O(Nk)
where k << N.
References
H. Deguchi et al., 2024, “Centroid-Based Efficient Minimum Bayes Risk Decoding”. https://aclanthology.org/2024.findings-acl.654
- class Config(kmeans: mbrs.modules.kmeans.Kmeans.Config = <factory>, count_weight: bool = False)
Bases: Config
Configuration for the decoder.
kmeans (Kmeans.Config): Configuration for k-means.
count_weight (bool): Weight the scores with counts.
- decode(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → Output
Select the n-best hypotheses based on the strategy.
- Parameters:
- Returns:
The n-best hypotheses.
- Return type:
DecoderCentroidMBR.Output
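The O(Nk) cost can be pictured as follows: cluster the reference representations into k centroids, then score each hypothesis against the k centroids rather than all N references. The sketch below is not the mbrs implementation. It assumes 2-D points stand in for sentence embeddings, uses negative Euclidean distance as a toy utility, and interprets `count_weight` as weighting each centroid by its cluster size. The functions `kmeans` and `centroid_mbr` are hypothetical.

```python
import math


def kmeans(points, k, niter=10):
    """Tiny k-means with deterministic init (first k points); niter must be >= 1."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(niter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return centroids, [len(m) for m in clusters]


def centroid_mbr(hyp_vecs, ref_vecs, k, count_weight=False):
    centroids, counts = kmeans(ref_vecs, k)
    weights = counts if count_weight else [1] * k
    scores = []
    for h in hyp_vecs:
        # Toy utility (an assumption): negative distance to each centroid.
        total = sum(w * -math.dist(h, c) for w, c in zip(weights, centroids))
        scores.append(total / sum(weights))
    best = max(range(len(hyp_vecs)), key=lambda i: scores[i])
    return best, scores


refs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
hyps = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
print(centroid_mbr(hyps, refs, k=2)[0])                     # 1
print(centroid_mbr(hyps, refs, k=2, count_weight=True)[0])  # 0
```

In this toy data, unweighted centroids favor the hypothesis between the two clusters, while weighting by cluster size shifts the choice toward the larger cluster.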
- class mbrs.decoders.DecoderMBR(cfg: mbrs.decoders.base.DecoderBase.Config, metric: mbrs.metrics.base.MetricBase, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderReferenceBased
Naive MBR decoder class.
Time complexity: O(N^2)
Space complexity: O(N^2)
References
S. Kumar and W. Byrne, 2004, “Minimum Bayes-Risk Decoding for Statistical Machine Translation”. https://aclanthology.org/N04-1022
B. Eikema and W. Aziz, 2020, “Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation”. https://aclanthology.org/2020.coling-main.398
- decode(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → Output
Select the n-best hypotheses based on the strategy.
- Parameters:
- Returns:
The n-best hypotheses.
- Return type:
DecoderMBR.Output
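The quadratic cost follows from scoring every hypothesis against every reference and ranking by the mean. A minimal sketch of that procedure is shown below. It does not call mbrs; `unigram_f1` is a toy utility and `naive_mbr` a hypothetical helper, both assumptions made for illustration.

```python
def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (an assumption): F1 over the sets of whitespace tokens."""
    h, r = set(hyp.split()), set(ref.split())
    overlap = len(h & r)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def naive_mbr(hypotheses: list[str], references: list[str], nbest: int = 1):
    """Score all hypothesis-reference pairs, then rank by expected utility."""
    expected = [
        sum(unigram_f1(h, r) for r in references) / len(references)
        for h in hypotheses
    ]
    order = sorted(range(len(hypotheses)), key=lambda i: expected[i], reverse=True)
    order = order[:nbest]
    return order, [hypotheses[i] for i in order], [expected[i] for i in order]


hyps = ["the cat sat", "a cat sat down", "the cat sat down"]
# Using the hypotheses as their own references is a common MBR setup.
idx, sentences, scores = naive_mbr(hyps, hyps, nbest=1)
print(idx, sentences)  # [2] ['the cat sat down']
```

The three returned lists mirror the idx, sentence, and score fields of Output.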
- class mbrs.decoders.DecoderProbabilisticMBR(cfg: mbrs.decoders.base.DecoderBase.Config, metric: mbrs.metrics.base.MetricBase, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderMBR
Probabilistic MBR decoder using alternating least squares (ALS) approximation.
References
F. Trabelsi et al., 2024, “Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms”. https://arxiv.org/abs/2406.02832
- class Config(reduction_factor: float = 8.0, regularization_weight: float = 0.1, rank: int = 8, niter: int = 10, seed: int = 0)
Bases: Config
Configuration for the decoder.
reduction_factor (float): Reduction factor. The computational budget will be reduced to 1 / reduction_factor.
regularization_weight (float): Weight of L2 regularization.
rank (int): Rank of the factorized matrices.
niter (int): The number of alternating steps performed.
seed (int): Random seed.
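To make the effect of reduction_factor concrete: with N hypotheses and M references, the full utility matrix has N × M entries, and only about N × M / reduction_factor of them are computed with the metric. The sketch below picks those entries uniformly at random with a fixed seed. Uniform sampling and the helper name `sample_observed_pairs` are assumptions for illustration, not a description of the library's internals.

```python
import random


def sample_observed_pairs(
    num_hyps: int, num_refs: int, reduction_factor: float = 8.0, seed: int = 0
) -> list[tuple[int, int]]:
    """Choose which (hypothesis, reference) entries to score under the budget."""
    all_pairs = [(i, j) for i in range(num_hyps) for j in range(num_refs)]
    budget = max(1, int(len(all_pairs) / reduction_factor))
    rng = random.Random(seed)  # fixed seed makes the choice reproducible
    return rng.sample(all_pairs, budget)


pairs = sample_observed_pairs(64, 64)
print(len(pairs))  # 512, i.e. 4096 / 8
```

The remaining entries are then estimated from a low-rank factorization of the partially observed matrix, which is where rank, niter, and regularization_weight come in.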
- decode(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → Output
Select the n-best hypotheses based on the strategy.
- Parameters:
- Returns:
The n-best hypotheses.
- Return type:
DecoderMBR.Output
- class mbrs.decoders.DecoderPruningMBR(cfg: mbrs.decoders.pruning_mbr.DecoderPruningMBR.Config, metric: mbrs.metrics.base.Metric, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderMBR
Pruning MBR decoder class.
References
J. Cheng and A. Vlachos, 2023, “Faster Minimum Bayes Risk Decoding with Confidence-based Pruning”. https://aclanthology.org/2023.emnlp-main.767/
- class Config(alpha: float = 0.99, sampling_scheduler: list[int] = <factory>, num_bootstrap_samples: int = 500, seed: int = 0)
Bases: Config
Configuration for the decoder.
alpha (float): Prune hypotheses based on this confidence threshold.
sampling_scheduler (list[int]): Sample size schedule. At step t, the number of samples is the t-th entry of this list.
num_bootstrap_samples (int): Number of bootstrap samples.
seed (int): Random seed for bootstrap sampling.
- decode(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → Output
Select the n-best hypotheses based on the strategy.
- Parameters:
- Returns:
The n-best hypotheses.
- Return type:
DecoderMBR.Output
- decode_pruning(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → tuple[list[float], list[int]]
Select the n-best hypotheses using pruning MBR decoding.
- Parameters:
- Returns:
- tuple[list[float], list[int]]
list[float]: Top-k scores.
list[int]: Top-k indices.
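How alpha, the sample size schedule, and the bootstrap interact can be sketched roughly as follows. This is a simplified reading of confidence-based pruning, not the library's or the paper's exact criterion. It assumes the first `size` references are used at each step, that a hypothesis is dropped when it matches or beats the current best in fewer than a fraction 1 - alpha of the bootstrap resamples, and that `toy_utility` (Jaccard overlap) stands in for a real metric. All function names are hypothetical.

```python
import random


def toy_utility(hyp: str, ref: str) -> float:
    """Toy utility (an assumption): Jaccard overlap of whitespace tokens."""
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / len(h | r) if h | r else 0.0


def pruning_mbr(hypotheses, references, schedule, alpha=0.99,
                num_bootstrap_samples=500, seed=0):
    if not schedule:
        raise ValueError("schedule must contain at least one sample size")
    rng = random.Random(seed)
    alive = list(range(len(hypotheses)))
    means = {}
    for size in schedule:
        refs = references[:size]
        # Utilities are computed for the surviving hypotheses only.
        rows = {i: [toy_utility(hypotheses[i], r) for r in refs] for i in alive}
        means = {i: sum(row) / len(row) for i, row in rows.items()}
        if len(alive) == 1 or len(refs) == len(references):
            break
        best = max(alive, key=lambda i: means[i])
        # Bootstrap: how often does each hypothesis match or beat the current best?
        wins = {i: 0 for i in alive}
        for _ in range(num_bootstrap_samples):
            cols = [rng.randrange(len(refs)) for _ in refs]
            best_total = sum(rows[best][c] for c in cols)
            for i in alive:
                if sum(rows[i][c] for c in cols) >= best_total:
                    wins[i] += 1
        # Keep hypotheses that win in at least (1 - alpha) of the resamples.
        alive = [i for i in alive if wins[i] / num_bootstrap_samples >= 1 - alpha]
    ranked = sorted(alive, key=lambda i: means[i], reverse=True)
    return [means[i] for i in ranked], ranked


hyps = ["the cat sat down", "a cat sat down", "the cat sat", "dogs bark loudly"]
refs = ["the cat sat down", "the cat sat", "a cat sat down",
        "the cat sat down", "the cat is down", "the cat sat"]
scores, indices = pruning_mbr(hyps, refs, schedule=[2, 4, 6])
print(indices[0], hyps[indices[0]])  # 0 the cat sat down
```

Hypotheses that never match the current best on the small early samples, such as the unrelated last sentence, are removed before the more expensive later steps.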
- class mbrs.decoders.DecoderReferenceBased(cfg: mbrs.decoders.base.DecoderBase.Config, metric: mbrs.metrics.base.MetricBase, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderBase
Decoder base class for strategies that use references, such as MBR decoding.
- abstract decode(hypotheses: list[str], references: list[str], source: str | None = None, nbest: int = 1, reference_lprobs: Tensor | None = None) → Output
Select the n-best hypotheses based on the strategy.
- Parameters:
- Returns:
The n-best hypotheses.
- Return type:
DecoderBase.Output
- class mbrs.decoders.DecoderReferenceless(cfg: mbrs.decoders.base.DecoderBase.Config, metric: mbrs.metrics.base.MetricBase, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderBase
Decoder base class for reference-free strategies.
- abstract decode(hypotheses: list[str], source: str, nbest: int = 1) → Output
Select the n-best hypotheses based on the strategy.
- metric: MetricReferenceless
- class mbrs.decoders.DecoderRerank(cfg: mbrs.decoders.base.DecoderBase.Config, metric: mbrs.metrics.base.MetricBase, selector: mbrs.selectors.base.Selector = <mbrs.selectors.nbest.SelectorNbest object>)
Bases: DecoderReferenceless
Reranking decoder class.
Time complexity: O(N)
Space complexity: O(N)
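Reranking is linear because each hypothesis is scored once against the source, with no references involved. The sketch below shows the shape of that computation. `toy_quality`, a length-mismatch penalty, is a deliberately crude stand-in for a reference-free metric, and `rerank` is a hypothetical helper; neither is part of mbrs.

```python
def toy_quality(hypothesis: str, source: str) -> float:
    """Toy reference-free score (an assumption): penalize length mismatch."""
    return -abs(len(hypothesis.split()) - len(source.split()))


def rerank(hypotheses: list[str], source: str, nbest: int = 1):
    """Score each hypothesis once, then keep the n best."""
    scores = [toy_quality(h, source) for h in hypotheses]
    order = sorted(range(len(hypotheses)), key=lambda i: scores[i], reverse=True)
    order = order[:nbest]
    return order, [hypotheses[i] for i in order], [scores[i] for i in order]


hyps = ["the cat", "the cat is sitting", "the cat is sitting down now"]
idx, sentences, scores = rerank(hyps, source="le chat est assis", nbest=1)
print(idx, sentences)  # [1] ['the cat is sitting']
```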