roneneldan/TinyStories-33M


roneneldan committed on Aug 8, 2023

Commit 72da9b3 (1 parent: 190d22e)

Update README.md

Files changed (1)

README.md +9 -0


@@ -8,6 +8,15 @@ Based on GPT-Neo architecture.
 License: mit
+---- hyperparams used to train this model ----
+lr = 5e-4
+lr_schedule = constant
+wd=0.1
+adam_beta1=0.9, adam_beta2 = 0.95
+context length=512
+batch size=80
+gradient accumulation steps=16
 ------ EXAMPLE USAGE ---
 from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
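
The hyperparameters added in this commit imply an effective batch size per optimizer update, which the README does not state directly. The following is a minimal sketch of that arithmetic, not the author's training code. It assumes that `wd` means weight decay, that "batch size" counts sequences per forward/backward pass on a single device, and that every sequence is filled to the full context length; the commit does not say how many devices were used. The class name `TrainHyperparams` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainHyperparams:
    """Values copied from the README diff; field names are illustrative."""
    lr: float = 5e-4
    lr_schedule: str = "constant"
    weight_decay: float = 0.1          # assumption: "wd" stands for weight decay
    adam_beta1: float = 0.9
    adam_beta2: float = 0.95
    context_length: int = 512          # tokens per sequence
    batch_size: int = 80               # assumption: sequences per pass, one device
    gradient_accumulation_steps: int = 16

    def sequences_per_update(self) -> int:
        # Gradients from several passes are summed before one optimizer step.
        return self.batch_size * self.gradient_accumulation_steps

    def max_tokens_per_update(self) -> int:
        # Upper bound: assumes each sequence uses the full context length.
        return self.sequences_per_update() * self.context_length


hp = TrainHyperparams()
print(hp.sequences_per_update())   # 1280
print(hp.max_tokens_per_update())  # 655360
```

If training ran on several devices with this batch size per device, both figures would scale by the device count.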