tlphams committed on
Commit
5dceb6c
1 Parent(s): 9aa112f

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -22,14 +22,14 @@ The following models were included in the merge:
  ## Benchmark results
  ### 1. MT-Bench from lmsys
  We adapted the code from [FastChat](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) to benchmark our model with GPT-4 as a judge. Here is the result
- ```
- | | Model | Turn | Score |
- |-------|--------------------------|------|----------|
- | First | tlphams/Wizard-Zephyr-Orpo-8x22B | 1 | 9.1625 |
- | | mistralai/Mixtral-8x22B-Instruct-v0.1 | 1 | 9.1500 |
- | Second| tlphams/Wizard-Zephyr-Orpo-8x22B | 2 | 8.873418 |
- | | mistralai/Mixtral-8x22B-Instruct-v0.1 | 2 | 8.250000 |
- | Average| tlphams/Wizard-Zephyr-Orpo-8x22B | | 9.018868 |
- | | mistralai/Mixtral-8x22B-Instruct-v0.1 | | 8.700000 |
+ ```markdown
+ |        | Model                                   | Turn | Score    |
+ |--------|-----------------------------------------|------|----------|
+ | First  | tlphams/Wizard-Zephyr-Orpo-8x22B        | 1    | 9.1625   |
+ |        | mistralai/Mixtral-8x22B-Instruct-v0.1   | 1    | 9.1500   |
+ | Second | tlphams/Wizard-Zephyr-Orpo-8x22B        | 2    | 8.873418 |
+ |        | mistralai/Mixtral-8x22B-Instruct-v0.1   | 2    | 8.250000 |
+ | Average| tlphams/Wizard-Zephyr-Orpo-8x22B        |      | 9.018868 |
+ |        | mistralai/Mixtral-8x22B-Instruct-v0.1   |      | 8.700000 |
  ```
  The score is slightly lower than [alpindale/WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B), but still higher than GPT-4-0314. Then the research and experimental work still need to continue ^^
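
A note on the Average rows in the table above, as a rough sketch rather than a statement of how the scores were produced. MT-Bench has 80 questions with two turns each. The Mixtral average (8.700000) equals the plain mean of its two turn scores, but the Wizard-Zephyr-Orpo average (9.018868) does not: a plain mean would give about 9.017959. The reported value is reproduced if the average is taken over individual judgments and one second-turn judgment is missing (79 instead of 80). These counts are an assumption inferred from the numbers, not something the README states, and `weighted_average` is a hypothetical helper.

```python
# Sanity check for the "Average" rows of the MT-Bench table.
# ASSUMPTION: the per-turn judgment counts (80 and 79) are inferred from the
# reported numbers; they are not given in the README.

def weighted_average(turn_scores, turn_counts):
    """Mean over all judgments, given per-turn mean scores and per-turn counts."""
    total = sum(score * count for score, count in zip(turn_scores, turn_counts))
    return total / sum(turn_counts)

# Per-turn scores copied from the table.
mixtral = weighted_average([9.1500, 8.250000], [80, 80])
wizard_zephyr = weighted_average([9.1625, 8.873418], [80, 79])  # assumed: one turn-2 judgment missing

# Plain (unweighted) mean of the two turn scores, for comparison.
wizard_zephyr_plain = (9.1625 + 8.873418) / 2

print(f"{mixtral:.6f}")              # 8.700000
print(f"{wizard_zephyr:.6f}")        # 9.018868
print(f"{wizard_zephyr_plain:.6f}")  # 9.017959
```

If the assumption holds, the two models' averages were computed over slightly different numbers of judgments, which is worth keeping in mind when comparing them to the third decimal place.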