Independent analysis

by yaronr - opened 4 days ago

Hi
I'm pleased to share our independent evaluation of the model using our implementation of the MMLU-Pro benchmark. I hope you find this useful.
The results show impressive performance for the model across multiple categories compared with other models, including many surprising ones (see the 'Unity Subjects' tab for a detailed breakdown).

We will release additional benchmarks and cost/performance data as time permits.
