OrionZheng
commited on
Commit
•
dacb39d
1
Parent(s):
edf2730
Update README.md
Browse files
README.md
CHANGED
@@ -8,14 +8,14 @@ license: apache-2.0
|
|
8 |
</p>
|
9 |
<hr>
|
10 |
|
11 |
-
# OpenMoE-8B(
|
12 |
OpenMoE is a project aimed at igniting the open-source MoE community! We are releasing a family of open-sourced Mixture-of-Experts (MoE) Large Language Models.
|
13 |
|
14 |
Our project began in the summer of 2023. On August 22, 2023, we released the first batch of intermediate checkpoints (OpenMoE-base&8B), along with the data and code [[Twitter]](https://twitter.com/xuefz/status/1693696988611739947?s=61&t=Xc2k2W7vU_hlpNizGDCmOw). Subsequently, the OpenMoE-8B training was completed in November, 2023. After that, we embarked on explorations on 34B scale model, which is still ongoing.
|
15 |
|
16 |
As a small student team, instead of pursuing the best model with better data, computation, and human power, we devote to fully sharing our training data, strategies, model architecture, weights, and everything we have with the community. We hope this project will promote research on this promising field and invite more contributors to work on open-sourced MoE projects together!
|
17 |
|
18 |
-
[2024.01.12] The paper for the project and more evaluations are underway. For more information about the model, training, and evaluations, please visit our GitHub [repository](https://github.com/XueFuzhao/OpenMoE/tree/main).
|
19 |
|
20 |
|
21 |
## Model Weights
|
|
|
8 |
</p>
|
9 |
<hr>
|
10 |
|
11 |
+
# OpenMoE-8B(400B tokens)
|
12 |
OpenMoE is a project aimed at igniting the open-source MoE community! We are releasing a family of open-sourced Mixture-of-Experts (MoE) Large Language Models.
|
13 |
|
14 |
Our project began in the summer of 2023. On August 22, 2023, we released the first batch of intermediate checkpoints (OpenMoE-base&8B), along with the data and code [[Twitter]](https://twitter.com/xuefz/status/1693696988611739947?s=61&t=Xc2k2W7vU_hlpNizGDCmOw). Subsequently, the OpenMoE-8B training was completed in November, 2023. After that, we embarked on explorations on 34B scale model, which is still ongoing.
|
15 |
|
16 |
As a small student team, instead of pursuing the best model with better data, computation, and human power, we devote to fully sharing our training data, strategies, model architecture, weights, and everything we have with the community. We hope this project will promote research on this promising field and invite more contributors to work on open-sourced MoE projects together!
|
17 |
|
18 |
+
**[2024.01.12]** The paper for the project and more evaluations are underway. For more information about the model, training, and evaluations, please visit our GitHub [repository](https://github.com/XueFuzhao/OpenMoE/tree/main).
|
19 |
|
20 |
|
21 |
## Model Weights
|