mbrs.metrics.bertscore module#
- class mbrs.metrics.bertscore.BERTScoreScoreType(value)[source]#
An enumeration.
- f1 = 2#
- precision = 0#
- recall = 1#
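The member values 0, 1, and 2 follow the order of the (precision, recall, F1) triple that the `bert_score` package returns when scoring, so a member can plausibly serve as an index into that triple. The following is a minimal sketch under that assumption; `ScoreType` and `select_score` are local stand-ins for illustration, not part of mbrs.

```python
from enum import IntEnum


class ScoreType(IntEnum):
    """Local stand-in that mirrors the documented member values."""

    precision = 0
    recall = 1
    f1 = 2


def select_score(prf: tuple, score_type: ScoreType) -> float:
    """Pick one component from a (precision, recall, f1) triple."""
    # An IntEnum member behaves like an int, so it can index the tuple directly.
    return prf[score_type]


prf = (0.91, 0.87, 0.89)  # made-up values, for illustration only
selected = select_score(prf, ScoreType.f1)
```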
- class mbrs.metrics.bertscore.MetricBERTScore(cfg: Config)[source]#
Bases: MetricCacheable
BERTScore metric class.
- class Cache(embeddings: list[Tensor], idf_weights: list[Tensor])[source]#
Bases: Cache
Intermediate representations of sentences.
- embeddings (list[Tensor]): A list of token embeddings of shape (T, D),
where T is the sequence length and D is the embedding size.
idf_weights (list[Tensor]): A list of IDF weights of shape (T,).
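To show how cached token embeddings and IDF weights of these shapes can be turned into scores, here is a rough pure-Python sketch of the greedy matching described in the BERTScore paper. It is not the code mbrs or `bert_score` runs (those operate on batched tensors); the helper names and toy data are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two D-dimensional vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def greedy_match(x_emb, x_idf, y_emb):
    """IDF-weighted mean, over tokens of x, of the best cosine match in y."""
    total = sum(w * max(cosine(x, y) for y in y_emb) for x, w in zip(x_emb, x_idf))
    return total / sum(x_idf)


def bertscore_prf(hyp_emb, hyp_idf, ref_emb, ref_idf):
    """Return (precision, recall, f1) for one hypothesis/reference pair."""
    precision = greedy_match(hyp_emb, hyp_idf, ref_emb)  # over hypothesis tokens
    recall = greedy_match(ref_emb, ref_idf, hyp_emb)  # over reference tokens
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: 2 hypothesis tokens and 3 reference tokens, each with D = 2.
hyp_emb = [[1.0, 0.0], [0.0, 1.0]]
hyp_idf = [1.0, 1.0]
ref_emb = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
ref_idf = [1.0, 1.0, 1.0]
p, r, f = bertscore_prf(hyp_emb, hyp_idf, ref_emb, ref_idf)
```

With uniform IDF weights, as in the toy data, the weighted mean reduces to a plain average over tokens.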
- class Config(score_type: BERTScoreScoreType = BERTScoreScoreType.f1, model_type: str | None = None, num_layers: int | None = None, batch_size: int = 64, nthreads: int = 4, idf: bool = False, idf_sents: list[str] | None = None, lang: str | None = None, rescale_with_baseline: bool = False, baseline_path: str | None = None, use_fast_tokenizer: bool = False, fp16: bool = False, bf16: bool = False, cpu: bool = False)[source]#
Bases: Config
BERTScore metric configuration.
- score_type (BERTScoreScoreType): The output score type, i.e.,
precision, recall, or f1.
- model_type (str): Contextual embedding model specification. Defaults to the
suggested model for the target language; at least one of model_type or lang must be specified.
- num_layers (int): The layer of representation to use. Defaults to the number
of layers tuned on WMT16 correlation data.
- idf (bool): Whether to use IDF weighting. (This should be
True even if idf_sents is given.)
idf_sents (list[str]): List of sentences used to compute the IDF weights.
batch_size (int): Batch size for BERTScore processing.
nthreads (int): Number of threads.
- lang (str): Language of the sentences; at least one of
model_type or lang must be specified. lang must be specified when rescale_with_baseline is True.
rescale_with_baseline (bool): Rescale BERTScore with a pre-computed baseline.
baseline_path (str): Customized baseline file.
use_fast_tokenizer (bool): use_fast parameter passed to HF tokenizer.
fp16 (bool): Use float16 for the forward computation.
bf16 (bool): Use bfloat16 for the forward computation.
cpu (bool): Use CPU for the forward computation.
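The parameter descriptions above imply a few constraints between fields. The sketch below restates them as checks on a hypothetical stand-in dataclass, `ConfigSketch`, which copies only a subset of the documented fields. It is not the mbrs `Config` class, and whether the library itself validates these combinations is not stated here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in with a subset of the documented fields."""

    model_type: Optional[str] = None
    lang: Optional[str] = None
    idf: bool = False
    idf_sents: Optional[List[str]] = None
    rescale_with_baseline: bool = False

    def problems(self) -> List[str]:
        """Return the documented constraints that this configuration violates."""
        found = []
        if self.model_type is None and self.lang is None:
            found.append("specify at least one of model_type or lang")
        if self.rescale_with_baseline and self.lang is None:
            found.append("lang is required when rescale_with_baseline is True")
        if self.idf_sents is not None and not self.idf:
            found.append("idf should be True when idf_sents is given")
        return found


ok = ConfigSketch(lang="en", rescale_with_baseline=True).problems()
bad = ConfigSketch(model_type="roberta-large", rescale_with_baseline=True).problems()
```

Here `ok` is empty, while `bad` reports the missing `lang`, because rescaling with a baseline needs the language even when `model_type` is given.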
- score_type: BERTScoreScoreType = 2#
- corpus_score(hypotheses: list[str], references_lists: list[list[str]], sources: list[str] | None = None) float[source]#
Calculate the corpus-level score.
- property device: device#
Returns the device of the model.
- encode(sentences: list[str]) Cache[source]#
Encode the given sentences into their intermediate representations.
- out_proj(hypotheses_ir: Cache, references_ir: Cache, sources_ir: Cache | None = None) Tensor[source]#
Forward the output projection layer.
- pairwise_scores(hypotheses: list[str], references: list[str], *_, **__) Tensor[source]#
Calculate the pairwise scores.
- scorer: BERTScorer#
- scores(hypotheses: list[str], references: list[str], *_, **__) Tensor[source]#
Calculate the scores of the given hypotheses.
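The difference between `scores` and `pairwise_scores` is easiest to see on a toy metric. The sketch below assumes that `scores` pairs hypotheses and references element-wise, and that `pairwise_scores` scores every hypothesis against every reference, producing an H x R matrix. This page does not spell out those shapes, so treat them as an assumption. `toy_metric` is a hypothetical word-overlap stand-in for BERTScore, and plain lists stand in for tensors.

```python
def toy_metric(hyp: str, ref: str) -> float:
    """Hypothetical stand-in for BERTScore: Jaccard overlap of word sets."""
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / len(h | r)


def scores(hyps, refs):
    """Element-wise: one score per (hyps[i], refs[i]) pair."""
    return [toy_metric(h, r) for h, r in zip(hyps, refs)]


def pairwise_scores(hyps, refs):
    """All combinations: an H x R matrix of scores."""
    return [[toy_metric(h, r) for r in refs] for h in hyps]


hyps = ["a b", "a c"]
refs = ["a b", "a b c"]
paired = scores(hyps, refs)  # length 2
matrix = pairwise_scores(hyps, refs)  # 2 rows, 2 columns
```

In an MBR decoding setting, each row of such a matrix can be averaged to estimate the expected utility of the corresponding hypothesis.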
- tokenizer: PreTrainedTokenizerBase#