tlphams/Wizard-Zephyr-Orpo-8x22B

tlphams committed on May 6

Commit 9aa112f • 1 Parent(s): ec0b388

Update README.md

Files changed (1)

README.md +8 -14

@@ -23,19 +23,13 @@ The following models were included in the merge:
 ### 1. MT-Bench from lmsys
 We adapted the code from [FastChat](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) to benchmark our model with GPT-4 as a judge. Here is the result
 ```
-########## First turn ##########
-                           score
-model               turn
-wizard-zephyr-8x22b 1     9.1625
-########## Second turn ##########
-                             score
-model               turn
-wizard-zephyr-8x22b 2     8.873418
-########## Average ##########
-                        score
-model
-wizard-zephyr-8x22b  9.018868
+|       | Model                    | Turn | Score    |
+|-------|--------------------------|------|----------|
+| First | tlphams/Wizard-Zephyr-Orpo-8x22B      | 1    | 9.1625   |
+|       | mistralai/Mixtral-8x22B-Instruct-v0.1   | 1    | 9.1500   |
+| Second| tlphams/Wizard-Zephyr-Orpo-8x22B      | 2    | 8.873418 |
+|       | mistralai/Mixtral-8x22B-Instruct-v0.1   | 2    | 8.250000 |
+| Average| tlphams/Wizard-Zephyr-Orpo-8x22B     |      | 9.018868 |
+|        | mistralai/Mixtral-8x22B-Instruct-v0.1  |      | 8.700000 |
 ```
 The score is slightly lower than [alpindale/WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B), but still higher than GPT-4-0314. Then the research and experimental work still need to continue ^^
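
A note on the "Average" row: for tlphams/Wizard-Zephyr-Orpo-8x22B, 9.018868 is not the plain mean of the two turn scores (that would be about 9.017959). It is consistent with a mean pooled over all judged answers, if one assumes 80 first-turn judgments and 79 second-turn judgments. MT-Bench has 80 two-turn questions, but the counts below are an assumption inferred from the numbers and are not stated in the README. A minimal sketch of that arithmetic, with `pooled_mean` as a hypothetical helper:

```python
# Per-turn means copied from the table in the diff.
first_turn_mean = 9.1625
second_turn_mean = 8.873418

# ASSUMPTION (not stated in the README): 80 first-turn judgments and
# 79 second-turn judgments, i.e. one second-turn judgment is missing.
n_first, n_second = 80, 79


def pooled_mean(mean_a: float, n_a: int, mean_b: float, n_b: int) -> float:
    """Mean over all rows of two groups, given each group's mean and size."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


# Plain average of the two per-turn means.
simple = (first_turn_mean + second_turn_mean) / 2

# Average weighted by the assumed number of judgments per turn.
pooled = pooled_mean(first_turn_mean, n_first, second_turn_mean, n_second)

print(f"simple mean of turns: {simple:.6f}")
print(f"pooled mean of rows:  {pooled:.6f}")
```

Under these assumed counts the pooled value rounds to the reported 9.018868. The Mixtral row (9.15, 8.25, average 8.70) is consistent with equal counts per turn, where the pooled and plain means coincide.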