Locutusque committed
Commit e66b659
Parent(s): 8dd987b
Update README.md

README.md CHANGED
@@ -38,7 +38,7 @@ Llama-3-Hercules-5.0-8B is well-suited to the following applications:
  - This model was trained on 8 Kaggle TPUs, using torch_xla SPMD for high MXU efficiency. There was no expense on my end (meaning you can reproduce this too!)
  - A learning rate of 2e-5 with the Adam optimizer. A linear scheduler was used, with an end factor of 0.005.
  - No mixed precision was used, with the default dtype being bfloat16.
- - A total batch size of
+ - A total batch size of 128 was used.
  - Trained on all examples of Hercules-v5.0 for 2 epochs.
  - No model parameters were frozen and no quantization was used.
  - This model was trained on OpenAI's ChatML prompt format. Because this model has function-calling capabilities, the prompt format is slightly different; here is what it looks like: ```<|im_start|>system\n{message}<|im_end|>\n<|im_start|>user\n{user message}<|im_end|>\n<|im_start|>call\n{function call message}<|im_end|>\n<|im_start|>function\n{function response message}<|im_end|>\n<|im_start|>assistant\n{assistant message}</s>```
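The "linear scheduler with an end factor of 0.005" matches the semantics of PyTorch's `torch.optim.lr_scheduler.LinearLR` (with `start_factor=1.0`, `end_factor=0.005`): the LR multiplier decays linearly from 1.0 to the end factor over training. A minimal dependency-free sketch of that decay rule (the function name and `total_steps` parameter are illustrative, not from the model card):

```python
def linear_lr(step, total_steps, base_lr=2e-5, end_factor=0.005):
    """Linearly interpolate the LR multiplier from 1.0 down to end_factor,
    mirroring torch.optim.lr_scheduler.LinearLR with start_factor=1.0."""
    frac = min(step / total_steps, 1.0)          # progress through training, clamped
    factor = 1.0 + (end_factor - 1.0) * frac     # 1.0 at step 0, end_factor at the end
    return base_lr * factor

# Start of training: full 2e-5; end of training: 2e-5 * 0.005 = 1e-7.
```

So the run begins at 2e-5 and finishes at roughly 1e-7 rather than decaying all the way to zero.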
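The extended ChatML template above adds `call` and `function` turns between the user and assistant turns. A small sketch of assembling a prompt in that format (the helper name and keyword arguments are hypothetical, not part of any official API):

```python
def build_prompt(system, user, call=None, function=None, assistant=""):
    """Assemble a prompt in the extended ChatML format described above:
    system, user, optional call/function turns, then the assistant turn,
    which closes with </s> rather than <|im_end|>."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>",
             f"<|im_start|>user\n{user}<|im_end|>"]
    if call is not None:        # model's emitted function call, if any
        parts.append(f"<|im_start|>call\n{call}<|im_end|>")
    if function is not None:    # the function's response fed back to the model
        parts.append(f"<|im_start|>function\n{function}<|im_end|>")
    parts.append(f"<|im_start|>assistant\n{assistant}</s>")
    return "\n".join(parts)
```

For plain chat without tool use, omit `call` and `function` and the result reduces to standard ChatML (aside from the `</s>` terminator on the assistant turn).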