Spaces:

Ahmadzei
/

RAG

Runtime error

File size: 842 Bytes

57bdca5

For example, you can use the [TextStreamer] class to stream the output of generate() into
your screen, one word at a time:
thon

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
tok = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
inputs = tok(["An increasing sequence: one,"], return_tensors="pt")
streamer = TextStreamer(tok)
Despite returning the usual output, the streamer will also print the generated text to stdout.
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=20)
An increasing sequence: one, two, three, four, five, six, seven, eight, nine, ten, eleven,

Decoding strategies
Certain combinations of the generate() parameters, and ultimately generation_config, can be used to enable specific
decoding strategies.