Tips
Oracle selection
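If true references are available, you can obtain the oracle outputs, i.e., the hypotheses that score highest against the true references under a given metric.
The following example performs oracle selection with COMET.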
Tip
MBR decoding with a single true reference selects the oracle hypothesis.
from mbrs.metrics import MetricCOMET
from mbrs.decoders import DecoderMBR

SOURCE = "ありがとう"
TRUE_REFERENCES = ["Thank you"]
HYPOTHESES = ["Thanks", "Thank you", "Thank you so much", "Thank you.", "thank you"]

# Set up the COMET metric.
metric_cfg = MetricCOMET.Config(model="Unbabel/wmt22-comet-da")
metric = MetricCOMET(metric_cfg)

# Set up the MBR decoder with that metric.
decoder_cfg = DecoderMBR.Config()
decoder = DecoderMBR(decoder_cfg, metric)

# Pass the true reference as the reference set to select the oracle hypothesis.
output = decoder.decode(HYPOTHESES, TRUE_REFERENCES, source=SOURCE, nbest=1)

print(f"Selected index: {output.idx}")
print(f"Output sentence: {output.sentence}")
print(f"Expected score: {output.score}")
mbrs-decode \
  hypotheses.txt \
  --num_candidates 1024 \
  --references true_references.txt \
  --num_references 1 \
  --source sources.txt \
  --output oracle_translations.txt \
  --decoder mbr \
  --metric comet --metric.model "Unbabel/wmt22-comet-da"
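To see why MBR decoding with a single true reference reduces to oracle selection, here is a minimal, library-free sketch. MBR picks the hypothesis with the highest mean utility over the references; with exactly one reference, that mean is just the utility against that reference, which is the oracle criterion. The names `toy_utility`, `mbr_select`, and `oracle_select` are hypothetical helpers written for this illustration and are not part of mbrs, and the character-overlap utility from `difflib` is only a stand-in for a real metric such as COMET.

```python
from difflib import SequenceMatcher


def toy_utility(hypothesis: str, reference: str) -> float:
    """Toy stand-in for a real metric (assumption: higher is better)."""
    return SequenceMatcher(None, hypothesis, reference).ratio()


def mbr_select(hypotheses, references, utility):
    """Return (index, expected utility) of the hypothesis with the highest
    mean utility over all references."""
    expected = [
        sum(utility(h, r) for r in references) / len(references)
        for h in hypotheses
    ]
    best = max(range(len(hypotheses)), key=expected.__getitem__)
    return best, expected[best]


def oracle_select(hypotheses, reference, utility):
    """Return (index, score) of the hypothesis that scores highest
    against the single true reference."""
    scores = [utility(h, reference) for h in hypotheses]
    best = max(range(len(hypotheses)), key=scores.__getitem__)
    return best, scores[best]


hypotheses = ["Thanks", "Thank you", "Thank you so much", "Thank you.", "thank you"]
true_references = ["Thank you"]

mbr_idx, mbr_score = mbr_select(hypotheses, true_references, toy_utility)
oracle_idx, oracle_score = oracle_select(hypotheses, true_references[0], toy_utility)

# With one reference, both procedures pick the same hypothesis.
print(hypotheses[mbr_idx])  # prints "Thank you"
```

With more than one reference the two procedures can diverge, since the mean over references no longer equals the score against any single one of them.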