# 4x1.8B MoE Qwen Ckpt 50000

This is a Mixture-of-Experts (MoE) model built from the Qwen 1.8B model. In this project, we combined four copies of the original model into a single MoE model and trained it with special training methods. This model is a checkpoint from the continued-pretraining stage.

![Training loss curve](loss_plot.png)

# Evaluations

| Groups            | n-shot | Metric   |  Value |   | Stderr |
|-------------------|-------:|----------|-------:|---|-------:|
| boolq             |      0 | acc      | 0.6508 | ± | 0.0083 |
| ceval-valid       |      0 | acc      | 0.5290 | ± | 0.1912 |
|                   |      0 | acc_norm | 0.5290 | ± | 0.1912 |
| cmmlu             |      0 | acc      | 0.5087 | ± | 0.1237 |
|                   |      0 | acc_norm | 0.5087 | ± | 0.1237 |
| mathqa            |      0 | acc      | 0.2647 | ± | 0.0081 |
|                   |      0 | acc_norm | 0.2693 | ± | 0.0081 |
| mmlu              |      0 | acc      | 0.4353 | ± | 0.0830 |
| - stem            |      0 | acc      | 0.3809 | ± | 0.0659 |
| - social_sciences |      0 | acc      | 0.4959 | ± | 0.0708 |
| - other           |      0 | acc      | 0.4844 | ± | 0.0744 |
| - humanities      |      0 | acc      | 0.3998 | ± | 0.0849 |

# Acknowledgements

+ [Qwen](https://github.com/QwenLM/Qwen)
+ [mistral.ai](https://mistral.ai)

# License Agreement

This project is open-sourced under the Tongyi Qianwen Research License Agreement. You can view the complete license agreement at this link: [Tongyi Qianwen RESEARCH LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT). When using this project, please ensure that your usage complies with the terms and conditions of the license agreement.
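
# Usage Example

A minimal inference sketch, assuming the checkpoint is published as a Hugging Face repository and loads through the standard `transformers` Auto classes; the repository id `your-org/4x1.8B-MoE-Qwen-ckpt-50000`, the `bfloat16` dtype, and the need for `trust_remote_code=True` (common for Qwen-based checkpoints with custom modeling code) are assumptions, not confirmed details of this project.

```python
# Sketch only: repo id below is a placeholder, not the actual model location.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/4x1.8B-MoE-Qwen-ckpt-50000"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit the available hardware
    device_map="auto",
    trust_remote_code=True,      # assumption: checkpoint ships custom Qwen/MoE modeling code
)

# Simple greedy-ish generation from a short prompt.
inputs = tokenizer("Hello, please introduce yourself.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```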