You need to set `num_beams` to a value greater than 1 and `do_sample=True` to use this decoding strategy.

```python
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, set_seed
>>> set_seed(0)  # For reproducibility

>>> prompt = "translate English to German: The house is wonderful."
>>> checkpoint = "google-t5/t5-small"

>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
>>> inputs = tokenizer(prompt, return_tensors="pt")

>>> model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
>>> outputs = model.generate(**inputs, num_beams=5, do_sample=True)
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Das Haus ist wunderbar.'
```

### Diverse beam search decoding

The diverse beam search decoding strategy is an extension of the beam search strategy that allows for generating a more diverse set of beam sequences to choose from.
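As a minimal sketch, diverse beam search can be requested through the `num_beam_groups` and `diversity_penalty` arguments of `generate` (the specific values used below are illustrative assumptions, not taken from the text above):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

prompt = "translate English to German: The house is wonderful."
checkpoint = "google-t5/t5-small"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
inputs = tokenizer(prompt, return_tensors="pt")
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Beams are split into groups; num_beams must be divisible by num_beam_groups.
# A diversity_penalty > 0 discourages later groups from picking tokens that
# earlier groups already chose at the same step. Group beam search is
# deterministic, so do_sample is left at its default of False.
outputs = model.generate(
    **inputs,
    num_beams=4,
    num_beam_groups=2,
    diversity_penalty=1.0,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Splitting the beams into groups is what produces the diversity: within a group, ordinary beam search runs as usual, while the penalty term pushes each subsequent group away from the hypotheses the previous groups have already committed to.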