Coding performance of base model?
#11 opened by rombodawg
Are you able to bench the fp16 model files on HumanEval?
I'd love to see how the coding performance is, especially considering it's a mixture-of-experts model, and those generally do well.
It hasn't been task-finetuned at all, so it would probably make sense to wait until something like EVOL-Instruct, Chain of Code, or whatever is most current this week has been applied to the base model before doing a code eval.
@ricofix base coding performance is just as important as any other type of eval. Trust me, you want to bench it before it's finetuned.
HumanEval Score is about 32.9%.
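For anyone wanting to reproduce a number like this: HumanEval results are usually reported as pass@k, averaged over its 164 problems. Below is a minimal sketch of that calculation, not the script used for the score above. The function names and the per-problem counts are hypothetical, and the 54-solved figure is only an illustration that happens to round to 32.9%; the thread does not say which k or how many samples were used.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed the unit tests, k: budget.
    """
    if n - c < k:
        # Every size-k subset of the samples contains at least one pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over all problems; results holds one (n, c) pair each."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data (assumption): 164 problems, one sample per problem,
# 54 of them solved. These counts are NOT taken from the thread.
results = [(1, 1)] * 54 + [(1, 0)] * 110
score = benchmark_score(results, k=1)
print(f"pass@1 = {score:.1%}")  # prints "pass@1 = 32.9%"
```

With one greedy sample per problem, pass@1 is just the fraction of problems solved, so scores from different runs are only comparable if the sampling settings match.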
Thanks @TechxGenus! You are the GOAT.