mbrs.metrics.chrf module

class mbrs.metrics.chrf.MetricChrF(cfg: Config)

Bases: MetricAggregatable

ChrF metric class.

class AggregatedReference(ngrams: list[Counter])

Bases: object

Aggregated reference representation.

  • ngrams (list[Counter]): Bags of n-grams for each order.

ngrams: list[Counter]
class Config(char_order: int = 6, word_order: int = 0, beta: int = 2, lowercase: bool = False, whitespace: bool = False, eps_smoothing: bool = False, num_workers: int = 8, fastchrf: bool = False)
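As an illustration of what "bags of n-grams for each order" can look like, the following is a minimal sketch that builds one Counter per character n-gram order from several references. The helpers char_ngrams and aggregate_references are hypothetical names, not part of mbrs, and summing the counts (rather than averaging them) is an assumption made for the example.

```python
from collections import Counter


def char_ngrams(text: str, order: int) -> Counter:
    """Count character n-grams of one order, with whitespace removed."""
    chars = "".join(text.split())
    return Counter(chars[i : i + order] for i in range(len(chars) - order + 1))


def aggregate_references(references: list[str], max_order: int) -> list[Counter]:
    """Return one combined bag of n-grams per order (index 0 holds unigrams).

    Assumption for this sketch: counts are summed over the references.
    """
    aggregated = []
    for order in range(1, max_order + 1):
        bag: Counter = Counter()
        for reference in references:
            bag += char_ngrams(reference, order)
        aggregated.append(bag)
    return aggregated


ngrams = aggregate_references(["a cat", "a cab"], max_order=2)
print(ngrams[0])  # unigram counts over both references
print(ngrams[1])  # bigram counts over both references
```

The resulting list has the same shape as the ngrams attribute described above: one Counter per n-gram order.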

Bases: Config

ChrF metric configuration.

  • char_order (int): Character n-gram order.

  • word_order (int): Word n-gram order. If equal to 2, the metric is referred to as chrF++.

  • beta (int): Determines the importance of recall relative to precision.

  • lowercase (bool): Enable case-insensitivity.

  • whitespace (bool): If True, include whitespaces when extracting character n-grams.

  • eps_smoothing (bool): If True, apply epsilon smoothing, as in the reference chrF++.py, NLTK, and Moses implementations.

    Otherwise, take the effective match order into account, as in sacreBLEU < 2.0.0.

  • num_workers (int): Number of workers for multiprocessing.

  • fastchrf (bool): Use the Rust implementation of chrF.
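The effect of beta can be sketched with the standard F-beta formula, F = (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. The function f_beta below is a hypothetical helper written for this illustration, not a function exported by mbrs.

```python
def f_beta(precision: float, recall: float, beta: int = 2) -> float:
    """Combine precision and recall; beta > 1 weights recall more heavily."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    beta_sq = beta**2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# With beta = 2, high recall is rewarded more than high precision.
high_recall = f_beta(precision=0.5, recall=1.0, beta=2)
high_precision = f_beta(precision=1.0, recall=0.5, beta=2)
# With beta = 1, the two cases are scored identically.
balanced = f_beta(precision=0.5, recall=1.0, beta=1)
print(high_recall, high_precision, balanced)
```

With the default beta = 2, a hypothesis that covers the reference well but contains extra material scores higher than one that is precise but incomplete.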

beta: int = 2
char_order: int = 6
eps_smoothing: bool = False
fastchrf: bool = False
lowercase: bool = False
num_workers: int = 8
whitespace: bool = False
word_order: int = 0
cfg: Config
corpus_score(hypotheses: list[str], references_lists: list[list[str]], sources: list[str] | None = None) → float

Calculate the corpus-level score.

Parameters:
  • hypotheses (list[str]) – Hypotheses.

  • references_lists (list[list[str]]) – Lists of references.

  • sources (list[str], optional) – Sources.

Returns:

The corpus score.

Return type:

float
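A corpus-level score is generally not the mean of segment-level scores. The sketch below assumes the corpus score pools match statistics over all segments before computing the F-score, as sacreBLEU does for chrF; whether MetricChrF follows exactly this procedure is not stated here. It uses character unigrams only, and unigram_stats and f_from_stats are hypothetical helpers.

```python
from collections import Counter


def unigram_stats(hypothesis: str, reference: str) -> tuple[int, int, int]:
    """Return (matches, hypothesis length, reference length) over characters."""
    hyp = Counter("".join(hypothesis.split()))
    ref = Counter("".join(reference.split()))
    matches = sum((hyp & ref).values())  # clipped overlap of the two bags
    return matches, sum(hyp.values()), sum(ref.values())


def f_from_stats(matches: int, hyp_total: int, ref_total: int, beta: int = 2) -> float:
    """F-beta score computed from raw match statistics."""
    if matches == 0:
        return 0.0
    precision = matches / hyp_total
    recall = matches / ref_total
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


hypotheses = ["ab", "abcdefgh"]
references = ["ab", "abcdxxxx"]
stats = [unigram_stats(h, r) for h, r in zip(hypotheses, references)]

# Mean of per-segment scores: every segment counts equally.
mean_of_segments = sum(f_from_stats(*s) for s in stats) / len(stats)

# Pooled statistics: longer segments contribute more.
pooled = f_from_stats(*(sum(column) for column in zip(*stats)))
print(mean_of_segments, pooled)
```

In this toy case the short, perfectly matched segment pulls the mean of segment scores above the pooled score.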

expected_scores_reference_aggregation(hypotheses: list[str], references: list[str], source: str | None = None, reference_lprobs: Tensor | None = None) → Tensor

Calculate the expected scores for each hypothesis.

Parameters:
  • hypotheses (list[str]) – Hypotheses.

  • references (list[str]) – References.

  • source (str, optional) – A source.

  • reference_lprobs (Tensor, optional) – Log-probabilities for each reference sample. The shape must be (len(references),). See https://arxiv.org/abs/2311.05263.

Returns:

The expected scores for each hypothesis.

Return type:

Tensor

pairwise_scores(hypotheses: list[str], references: list[str], *_, **__) → Tensor

Calculate the pairwise scores.

Parameters:
  • hypotheses (list[str]) – Hypotheses.

  • references (list[str]) – References.

Returns:

Score matrix of shape (H, R), where H is the number of hypotheses and R is the number of references.

Return type:

Tensor

score(hypothesis: str, reference: str, *_, **__) → float

Calculate the score of the given hypothesis.

Parameters:
  • hypothesis (str) – Hypothesis.

  • reference (str) – Reference.

Returns:

The score of the given hypothesis.

Return type:

float

scores(hypotheses: list[str], references: list[str], *_, **__) → Tensor

Calculate the scores of the given hypotheses.

Parameters:
  • hypotheses (list[str]) – N hypotheses.

  • references (list[str]) – N references.

Returns:

The N scores of the given hypotheses.

Return type:

Tensor
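The difference between pairwise_scores and scores can be sketched with a toy similarity function: the former scores every hypothesis against every reference, while the latter pairs the i-th hypothesis with the i-th reference. The names toy_score, pairwise, and elementwise are hypothetical, toy_score is a character-set overlap rather than chrF, and the length check is a choice made for this sketch.

```python
def toy_score(hypothesis: str, reference: str) -> float:
    """Jaccard overlap of character sets; a stand-in for a real metric."""
    hyp_chars, ref_chars = set(hypothesis), set(reference)
    union = hyp_chars | ref_chars
    return len(hyp_chars & ref_chars) / len(union) if union else 0.0


def pairwise(hypotheses: list[str], references: list[str]) -> list[list[float]]:
    """H x R matrix: one row per hypothesis, one column per reference."""
    return [[toy_score(h, r) for r in references] for h in hypotheses]


def elementwise(hypotheses: list[str], references: list[str]) -> list[float]:
    """N scores: the i-th hypothesis is scored against the i-th reference."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    return [toy_score(h, r) for h, r in zip(hypotheses, references)]


hyps = ["abc", "abd"]
refs = ["abc", "xyz"]
matrix = pairwise(hyps, refs)
paired = elementwise(hyps, refs)
print(matrix)
print(paired)
```

When both lists have the same length, the element-wise scores are the diagonal of the pairwise matrix.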