metadata
license: cc-by-4.0
language:
  - en
  - zh
metrics:
  - accuracy


A TOP Finetuned Model by xDAN-AI

🤖 #TOP1 on MT-Bench with a score of 8.45, outperforming GPT-3.5 Turbo and 70B models! 🤖

🤖 #TOP2 on C-Eval with a score of 79.45, outperforming GPT-3.5 Turbo and 70B models! 🤖

Exceptional Performance in Key Areas:

MT-Bench Leaderboard TOP2


C-Eval Leaderboard TOP2

| Rank | Model | Creator | Submission Date | Avg | Avg (Hard) | STEM | Social Science | Humanities | Others |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Yi-34B | 零一万物 | 2023/11/2 | 81.4 | 58.7 | 73.7 | 89.6 | 84.6 | 84.9 |
| 2 | xDAN-L2-Chat | xDAN-AI 新旦智能 | 2023/11/10 | 79.27 | 69.07 | 69.07 | 87.64 | 85.99 | 80.21 |
| 3 | BlueLM-7B | vivo | 2023/11/7 | 73.3 | 48.9 | 64.3 | 83.3 | 76.5 | 77.1 |
| 4 | Qwen-14B | Alibaba Cloud | 2023/9/22 | 72.1 | 53.7 | 65.7 | 85.4 | 75.3 | 68.4 |
| 5 | Yi-6B | 零一万物 | 2023/11/2 | 72 | 46.6 | 62.3 | 83.9 | 76.3 | 74.6 |
| 6 | XuanYuan-70B | 度小满AI-Lab | 2023/9/21 | 71.9 | 53.6 | 67.7 | 83.3 | 73.9 | 67.4 |
| 7 | ChatGLM3-6B-base | Tsinghua & Zhipu.AI | 2023/10/26 | 69 | 46.8 | 61 | 82.4 | 73.4 | 66.9 |
| 8 | GPT-4* | OpenAI | 2023/5/15 | 68.7 | 54.9 | 67.1 | 77.6 | 64.5 | 67.8 |
| 9 | XVERSE-65B | XVERSE Technology | 2023/11/5 | 68.6 | 46.2 | 61.3 | 81.4 | 71 | 67.8 |
| 10 | Nanbeige-16B-Base | Nanbeige LLM Lab | 2023/11/8 | 63.8 | 43.5 | 57.8 | 77.2 | 66.9 | 59.4 |
| 11 | LingoWhale-8B | 深言科技(DeepLangAI) | 2023/11/3 | 63.6 | 46.4 | 57 | 73.7 | 68.5 | 61.5 |
| 12 | Qwen-7B v1.1 | Alibaba Cloud | 2023/9/12 | 63.5 | 46.4 | 57.7 | 78.1 | 66.6 | 57.8 |
| 13 | ChatGPT* | OpenAI | 2023/5/15 | 54.4 | 41.4 | 52.9 | 61.8 | 50.9 | 53.6 |
| 14 | Claude-v1.3* | Anthropic | 2023/5/15 | 54.2 | 39 | 51.9 | 61.7 | 52.1 | 53.7 |
| 15 | Baichuan-13B | Baichuan | 2023/7/9 | 53.6 | 36.7 | 47 | 66.8 | 57.3 | 49.8 |