Independent analysis
#7, opened by yaronr
Hi
I'm pleased to share our independent evaluation of the model using our implementation of the MMLU-Pro benchmark. I hope you find this useful.
The results show impressive performance for the model across multiple categories compared with other models, including many surprising ones (see the 'Unity Subjects' tab for a detailed breakdown).
We will release additional benchmarks and cost/performance data as time permits.