Locutusque committed
Commit: c768e26
Parent(s): 90b89d1

Update README.md

README.md CHANGED
@@ -31,3 +31,6 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 | Winogrande (5-shot) | 50.75 |
 | GSM8K (5-shot)      | 0.0   |
 | DROP (3-shot)       | 0.74  |
+
+
+The purpose of this model is to prove that trillion-scale datasets are not needed to pretrain a language model. As a result of needing small datasets, this model was pretrained on a single GPU (Titan V).