Ahmadzei's picture
update 1
57bdca5
raw
history blame
528 Bytes
Set the forced_bos_token_id to en in the generate method to translate to English:
generated_tokens = model.generate(**encoded_en, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
"Don't interfere with the wizard's affairs, because they are subtle, will soon get angry."
If you are using the facebook/mbart-large-50-many-to-one-mmt checkpoint, you don't need to force the target language id as the first generated token otherwise the usage is the same..