blair-johnson committed
Commit 513cd61
1 Parent(s): 0d57b8c
Update README.md
README.md CHANGED
@@ -39,6 +39,10 @@ Fine-tuning the base GALACTICA models on the 52k instruction-response pairs in t
 The GALPACA weights are made available for use with the `transformers` library.
 TODO: add example inference usage.
 
+## Training Resources
+
+GALPACA 30B was fine-tuned in about 8 hours using 16 A100 80GB GPUs at an effective batch size of 1024 and with a maximum context window of 384 tokens. This model was trained using DeepSpeed Stage 3 optimizations.
+
 ## Performance and Limitations
 
 More information about the performance and limitations of this family of models can be found on the original GALACTICA model card.
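The TODO in the diff leaves inference usage unspecified. Below is a minimal sketch of what that usage could look like with `transformers`; the repository ID, prompt template, and generation settings are assumptions, not part of the model card.

```python
# A minimal inference sketch, assuming the checkpoint is published under the
# repository ID below (hypothetical) and follows the Alpaca prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "GeorgiaTechResearchInstitute/galpaca-30b"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision; a 30B model needs ~60 GB in fp16
    device_map="auto",          # shard across available GPUs (requires accelerate)
)

# Alpaca-style instruction prompt (assumed, matching the instruction-response
# pairs the model was fine-tuned on).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWhat is the Transformer architecture?\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```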
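The new Training Resources section names DeepSpeed Stage 3 but no configuration. A sketch of a ZeRO Stage 3 config as a Python dict (e.g. for `deepspeed.initialize()` or the Hugging Face Trainer's `deepspeed` argument) follows; only the Stage 3 setting and the effective batch size of 1024 come from the card, and every other value is an assumption.

```python
# Sketch of a DeepSpeed ZeRO Stage 3 configuration. Only `stage: 3` and
# `train_batch_size: 1024` are stated on the model card; the rest is assumed.
ds_config = {
    "train_batch_size": 1024,   # effective batch size stated on the card
    "bf16": {"enabled": True},  # assumed precision on A100 80GB GPUs
    "gradient_clipping": 1.0,   # assumed; a common default
    "zero_optimization": {
        "stage": 3,             # ZeRO Stage 3: shard parameters, gradients,
                                # and optimizer states across the 16 GPUs
        "overlap_comm": True,   # overlap communication with computation
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}
```

Stage 3 shards parameters as well as gradients and optimizer states, which is what makes a 30B-parameter fine-tune feasible on 16 GPUs.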