You should try training a model with 2B parameters and context length 32000.
#3
opened by win10
You should try training a model with 2B parameters and context length 32000.
Wishing you a happy and successful new year.
I am more interested in why @PY007 settled on 1.1B. How did they come up with this number?