BLOOM API inference
Hi, I've been playing around with the API and discovered two interesting things that maybe someone could explain to me.
First: when I switch the model to greedy decoding in the Hugging Face API, it works fine, but when I set "do_sample": False
in my request, it returns different results each time. Is there any way to force BLOOM to be more deterministic?
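For reference, here is a minimal sketch of how I'm building the request body (the prompt and max_new_tokens value are made up for illustration; the payload shape follows the standard Inference API format):

```python
import json

# Minimal sketch of the request body; the prompt and the
# max_new_tokens value are placeholders for illustration.
payload = {
    "inputs": "The quick brown fox",
    "parameters": {
        "do_sample": False,  # Python's False serializes to JSON false
        "max_new_tokens": 20,
    },
}
body = json.dumps(payload)
print(body)
```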
Second: when I get the response as JSON, it skips the space before punctuation, e.g. "bla , bla" -> "bla, bla". Is there a way to avoid this behavior?
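To illustrate what I mean, here is a hypothetical sketch (not the API's actual code) of the kind of detokenization cleanup that appears to be happening:

```python
def cleanup_spaces(text: str) -> str:
    """Hypothetical illustration of a cleanup step that removes
    spaces before punctuation; not the API's actual implementation."""
    for mark in [",", ".", "!", "?", ";", ":"]:
        text = text.replace(" " + mark, mark)
    return text

print(cleanup_spaces("bla , bla"))  # -> "bla, bla"
```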
Thanks for any answers!
To your first question: the parameter to set is do_sample, which might explain the non-determinism you're observing.
Oh yes, I did use do_sample; it was a typo here, so I quickly edited the question.
I encountered the same problem as described in your second question. Could this be caused by the return_full_text
parameter not working? It always seems to be True.
https://huggingface.co/bigscience/bloom/discussions/153#6397907b71eb2455d898e0a4