blair-johnson committed
Commit 513cd61
1 Parent(s): 0d57b8c

Update README.md

Files changed (1):
  1. README.md +4 -0
README.md CHANGED
@@ -39,6 +39,10 @@ Fine-tuning the base GALACTICA models on the 52k instruction-response pairs in t
  The GALPACA weights are made available for use with the `transformers` library.
  TODO: add example inference usage.
 
+ ## Training Resources
+
+ GALPACA 30B was fine-tuned in about 8 hours using 16 A100 80GB GPUs at an effective batch size of 1024 and with a maximum context window of 384 tokens. This model was trained using DeepSpeed Stage 3 optimizations.
+
  ## Performance and Limitations
 
  More information about the performance and limitations of this family of models can be found on the original GALACTICA model card.
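
The diff above leaves example inference usage as a TODO. Until the authors fill it in, here is a minimal sketch using the `transformers` generation API; the checkpoint id below is a placeholder, and the Alpaca-style prompt format is an assumption based on the 52k instruction-response training pairs:

```python
# Minimal inference sketch. Assumptions: the repo id is a placeholder,
# and the Alpaca-style prompt mirrors the instruction-tuning data.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "your-org/galpaca-30b"  # hypothetical id; substitute the real Hub repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; 30B weights are large
    device_map="auto",          # shards across available GPUs (needs `accelerate`)
)

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what attention is in a Transformer.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```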
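
The new Training Resources section cites DeepSpeed Stage 3 at an effective batch size of 1024 across 16 GPUs. For readers reproducing a comparable setup, a minimal sketch of such a configuration through the `transformers` Trainer follows; every value here is an illustrative assumption, since the commit does not include the actual launch configuration:

```python
# Illustrative ZeRO Stage 3 setup via the Hugging Face Trainer integration.
# Assumed split: 16 GPUs x 4 per-device batch x 16 accumulation steps = 1024.
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {"stage": 3},   # shard params, grads, and optimizer state
    "bf16": {"enabled": True},           # A100s support bfloat16
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

args = TrainingArguments(
    output_dir="galpaca-30b-finetune",   # hypothetical path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,
    bf16=True,
    deepspeed=ds_config,                 # Trainer accepts a dict or a JSON path
)
```

Inputs would additionally be truncated or packed to the 384-token context window at tokenization time.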