Maximum input+output tokens ??

#1
by ha1772007 - opened

Maximum input+output tokens ??

CoolSpring/Qwen2-0.5B-Abyme was trained with a sequence_len of 4096, while Qwen/Qwen2-0.5B-Instruct handles a 32768-token context in the Needle in a Haystack task, according to the Qwen team's release blog post. So I would guess a number somewhere in between, leaning toward the low side.
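One way to check the declared limit is the `max_position_embeddings` field in a model's `config.json` on the Hub. Below is a minimal, self-contained sketch; the JSON excerpt is illustrative (not the actual config of this model), and the "practical limit" heuristic reflects the guess above, not a guarantee:

```python
import json

# Illustrative excerpt of a Hugging Face config.json; the field
# max_position_embeddings declares the maximum context window
# (input + output tokens combined).
config_json = '{"model_type": "qwen2", "max_position_embeddings": 32768}'

config = json.loads(config_json)
declared_ctx = config["max_position_embeddings"]
print(declared_ctx)  # the context window the config advertises

# A fine-tune trained with a shorter sequence_len may degrade beyond
# that length even if the config still advertises the base model's
# window, so the training length is a more conservative practical limit.
train_len = 4096
practical_limit = min(train_len, declared_ctx)
print(practical_limit)
```

In practice you could load the real values with `transformers.AutoConfig.from_pretrained(...)` and read `.max_position_embeddings`, then compare against the fine-tune's training sequence length.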

However, this is still a guess, and I haven't personally used the model much since it was made for experimental purposes. I'm happy to see you're interested in it. Take care!
