rhysjones committed on
Commit fa70f50
1 Parent(s): 20bbca2

Updates to evaluations from Yet Another LLM Leaderboard results

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -35,7 +35,7 @@ Evaluations done using mlabonne's usefull [Colab notebook llm-autoeval](https://
  Also check out the alternative leaderboard at [Yet_Another_LLM_Leaderboard](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard)
  | Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
  |----------------------------------------------------------------|------:|------:|---------:|-------:|------:|
- |[phi-2-orange](https://huggingface.co/rhysjones/phi-2-orange)| **33.29**| 71.39| 49.9| 37.14| **47.93**|
+ |[phi-2-orange](https://huggingface.co/rhysjones/phi-2-orange)| **33.37**| 71.33| 49.87| **37.3**| **47.97**|
  |[phi-2-dpo](https://huggingface.co/lxuechen/phi-2-dpo)| 30.39| **71.68**| **50.75**| 34.9| 46.93|
- |[dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)| **33.12**| 69.85| 47.39| **37.2**| 46.89|
+ |[dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)| 33.12| 69.85| 47.39| 37.2| 46.89|
  |[phi-2](https://huggingface.co/microsoft/phi-2)| 27.98| 70.8| 44.43| 35.21| 44.61|
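
The Average column looks like the unweighted mean of the four benchmark scores. That is an assumption, since the diff does not state it. A minimal Python sketch for checking the phi-2-orange row before and after this commit follows; the helper name `mean_score` and the dictionaries are made up for illustration, and the numbers are copied from the rows above.

```python
# Scores for phi-2-orange, copied from the "-" (old) and "+" (new) rows.
# Column order: AGIEval, GPT4All, TruthfulQA, Bigbench.
old_scores = {"AGIEval": 33.29, "GPT4All": 71.39, "TruthfulQA": 49.9, "Bigbench": 37.14}
new_scores = {"AGIEval": 33.37, "GPT4All": 71.33, "TruthfulQA": 49.87, "Bigbench": 37.3}


def mean_score(scores):
    """Unweighted arithmetic mean of the benchmark scores.

    Assumption: this is how the Average column is derived.
    """
    values = list(scores.values())
    return sum(values) / len(values)


old_average = round(mean_score(old_scores), 2)
new_average = round(mean_score(new_scores), 2)

print(f"old average: {old_average}")
print(f"new average: {new_average}")
```

If the assumption holds, the two printed values should match the Average cells in the removed and added phi-2-orange rows.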