---
license: apache-2.0
language:
- zh
---
# 4x1.8B MoE Qwen Ckpt 50000

This is a mixture-of-experts (MoE) model built on Qwen 1.8B. We combined four of the original models into a single MoE model and trained it with a specialized training method.

This release is a checkpoint from the continued-pretraining stage.
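The card does not say how tokens are routed between the four experts, so the following is only an illustrative sketch of one common scheme, top-k gating, and not a description of this model's actual implementation. The function names (`route_top_k`, `moe_forward`), the choice of `k=2`, and the toy experts are all hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1.

    Assumption: top-k gating; the model card does not state the routing method.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k=2):
    """Combine the selected experts' outputs as a weighted sum."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Four toy functions standing in for the four experts (hypothetical).
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]

# Hypothetical router scores for one token: experts 0 and 1 are favored.
print(moe_forward(3.0, experts, [0.0, 0.0, -10.0, -10.0]))
```

In a real MoE layer the router logits come from a learned linear projection of the token's hidden state and each expert is a neural network block; the sketch keeps only the select-and-mix step.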

![](loss_plot.png)

# Evaluations

|      Groups      |n-shot| Metric |Value |   |Stderr|
|------------------|-----:|--------|-----:|---|-----:|
|boolq             |     0|acc     |0.6508|±  |0.0083|
|ceval-valid       |     0|acc     |0.5290|±  |0.1912|
|                  |     0|acc_norm|0.5290|±  |0.1912|
|cmmlu             |     0|acc     |0.5087|±  |0.1237|
|                  |     0|acc_norm|0.5087|±  |0.1237|
|mathqa            |     0|acc     |0.2647|±  |0.0081|
|                  |     0|acc_norm|0.2693|±  |0.0081|
|mmlu              |     0|acc     |0.4353|±  |0.0830|
| - stem           |     0|acc     |0.3809|±  |0.0659|
| - social_sciences|     0|acc     |0.4959|±  |0.0708|
| - other          |     0|acc     |0.4844|±  |0.0744|
| - humanities     |     0|acc     |0.3998|±  |0.0849|
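As a reading aid for the table, the sketch below turns a `Value ± Stderr` pair into an approximate 95% interval. It assumes a normal approximation (`z = 1.96`), which is not stated in the card. The helper name `confidence_interval` is hypothetical. For group rows such as `ceval-valid`, `cmmlu`, and `mmlu`, the large stderr may reflect variation across subtasks, so treat those intervals with extra caution.

```python
def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for value +/- z * stderr (normal approximation, assumed)."""
    half_width = z * stderr
    return value - half_width, value + half_width

# (accuracy, stderr) pairs copied from the table above.
rows = {
    "boolq": (0.6508, 0.0083),
    "mathqa": (0.2647, 0.0081),
    "mmlu": (0.4353, 0.0830),
}

for name, (value, stderr) in rows.items():
    low, high = confidence_interval(value, stderr)
    print(f"{name}: {value:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```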

# Acknowledgements

+ [Qwen](https://github.com/QwenLM/Qwen)
+ [mistral.ai](https://mistral.ai)

# License Agreement

This project is released under the Tongyi Qianwen Research License Agreement. The full text is available here: [Tongyi Qianwen RESEARCH LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT).

When using this project, please make sure your use complies with the terms and conditions of the license agreement.