metadata
license: apache-2.0
language:
  - zh

4x1.8B MoE Qwen Ckpt 50000

This project builds a Mixture-of-Experts (MoE) model from the Qwen 1.8B model. We concatenated four original models into a single MoE model and trained the result using special training methods.

This release is a checkpoint from the continued pretraining stage.
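
This card does not describe how the four models are combined or how tokens are routed between them. For readers unfamiliar with the general idea, the following is a minimal sketch of a sparse MoE feed-forward step, assuming a learned linear router and top-2 gating. The names `make_expert`, `moe_forward`, and `router_weights`, the toy dimensions, and the single-matrix experts are all hypothetical and are not taken from this project.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Hypothetical stand-in for one expert: a single dim x dim linear map."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def moe_forward(x, experts, router_weights, top_k=2):
    """Route one token vector x to its top_k experts and mix their outputs.

    Assumption: top-k gating with a softmax over the selected logits.
    The actual routing used by this model is not documented in the card.
    """
    # One router logit per expert.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Indices of the top_k highest-scoring experts.
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalize gate weights over the selected experts only.
    gates = softmax([logits[i] for i in top])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, top):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, top, gates


rng = random.Random(0)
dim, num_experts = 8, 4  # four experts, mirroring the "4x" in the model name
experts = [make_expert(dim, rng) for _ in range(num_experts)]
router_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
token = [rng.uniform(-1, 1) for _ in range(dim)]

output, chosen, gates = moe_forward(token, experts, router_weights)
print("experts used:", chosen, "gate weights:", gates)
```

Only the selected experts are evaluated per token, which is why an MoE model can hold more parameters than a dense model while keeping per-token compute closer to that of a single expert.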

Evaluations

| Groups | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|
| boolq | 0 | acc | 0.6508 | ± 0.0083 |
| ceval-valid | 0 | acc | 0.5290 | ± 0.1912 |
| | 0 | acc_norm | 0.5290 | ± 0.1912 |
| cmmlu | 0 | acc | 0.5087 | ± 0.1237 |
| | 0 | acc_norm | 0.5087 | ± 0.1237 |
| mathqa | 0 | acc | 0.2647 | ± 0.0081 |
| | 0 | acc_norm | 0.2693 | ± 0.0081 |
| mmlu | 0 | acc | 0.4353 | ± 0.0830 |
| - stem | 0 | acc | 0.3809 | ± 0.0659 |
| - social_sciences | 0 | acc | 0.4959 | ± 0.0708 |
| - other | 0 | acc | 0.4844 | ± 0.0744 |
| - humanities | 0 | acc | 0.3998 | ± 0.0849 |
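
To help read the Value and Stderr columns together, here is a small sketch that turns a reported score and its standard error into an approximate 95% interval. It assumes a normal approximation (z = 1.96); the helper name `confidence_interval` is hypothetical, and this card does not state how the standard errors were computed.

```python
def confidence_interval(value, stderr, z=1.96):
    """Approximate interval value ± z * stderr (normal approximation, an assumption)."""
    return value - z * stderr, value + z * stderr


# boolq row from the table: acc 0.6508 with stderr 0.0083
low, high = confidence_interval(0.6508, 0.0083)
print(f"boolq acc: {low:.4f} to {high:.4f}")
```

Under the same assumption, rows with a large standard error, such as ceval-valid, correspond to much wider intervals, so small differences in those scores should be read with caution.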

Acknowledgements

License Agreement

This project is released under the Tongyi Qianwen Research License Agreement. The complete license agreement is available at https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT

When using this project, please ensure that your usage complies with the terms and conditions of the license agreement.