blair-johnson committed
Commit 513cd61
1 Parent(s): 0d57b8c
Update README.md
README.md CHANGED
@@ -39,6 +39,10 @@ Fine-tuning the base GALACTICA models on the 52k instruction-response pairs in t
 The GALPACA weights are made available for use with the `transformers` library.
 TODO: add example inference usage.
 
+## Training Resources
+
+GALPACA 30B was fine-tuned in about 8 hours using 16 A100 80GB GPUs at an effective batch size of 1024 and with a maximum context window of 384 tokens. This model was trained using DeepSpeed Stage 3 optimizations.
+
 ## Performance and Limitations
 
 More information about the performance and limitations of this family of models can be found on the original GALACTICA model card.
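The TODO in the diff leaves inference usage unspecified. Below is a minimal sketch of what that usage could look like with `transformers`; the repository ID, prompt template, and generation settings are assumptions, not part of the model card.

```python
# A minimal inference sketch, assuming the checkpoint is published under the
# repository ID below (hypothetical) and follows the Alpaca prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "GeorgiaTechResearchInstitute/galpaca-30b"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision; a 30B model needs ~60 GB in fp16
    device_map="auto",          # shard across available GPUs (requires accelerate)
)

# Alpaca-style instruction prompt (assumed, matching the instruction-response
# pairs the model was fine-tuned on).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWhat is the Transformer architecture?\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```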
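The new Training Resources section names DeepSpeed Stage 3 but no configuration. A sketch of a ZeRO Stage 3 config as a Python dict (e.g. for `deepspeed.initialize()` or the Hugging Face Trainer's `deepspeed` argument) follows; only the Stage 3 setting and the effective batch size of 1024 come from the card, and every other value is an assumption.

```python
# Sketch of a DeepSpeed ZeRO Stage 3 configuration. Only `stage: 3` and
# `train_batch_size: 1024` are stated on the model card; the rest is assumed.
ds_config = {
    "train_batch_size": 1024,   # effective batch size stated on the card
    "bf16": {"enabled": True},  # assumed precision on A100 80GB GPUs
    "gradient_clipping": 1.0,   # assumed; a common default
    "zero_optimization": {
        "stage": 3,             # ZeRO Stage 3: shard parameters, gradients,
                                # and optimizer states across the 16 GPUs
        "overlap_comm": True,   # overlap communication with computation
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}
```

Stage 3 shards parameters as well as gradients and optimizer states, which is what makes a 30B-parameter fine-tune feasible on 16 GPUs.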