Locutusque committed
Commit: c768e26
Parent(s): 90b89d1

Update README.md

README.md CHANGED
@@ -31,3 +31,6 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 | Winogrande (5-shot) | 50.75 |
 | GSM8K (5-shot)      | 0.0   |
 | DROP (3-shot)       | 0.74  |
+
+
+The purpose of this model is to prove that trillion-scale datasets are not needed to pretrain a language model. As a result of needing small datasets, this model was pretrained on a single GPU (Titan V).