BramVanroy committed
Commit 1e5a668 · Parent(s): b106433
Update README.md

README.md CHANGED
@@ -33,7 +33,8 @@ wanted to see if the performance would be reasonable after finetuning this model
 Trained on the [yhavinga/mc4_nl_cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned/viewer/tiny/train) dataset (`tiny` partition) for one epoch. The canonical
 validation split was not used but instead 5% of `train` was used as validation.
 
-At 2048 tokens context length, the training set was around 2M (2,008,858) samples, and the model was trained for 1 epoch.
+At 2048 tokens context length, the training set was around 2M (2,008,858) samples, and the model was trained for 1 epoch. That means the model was trained on
+around 4B Dutch tokens (`2048 * 2008858 = 4,114,141,184`).
 
 
 ## Training procedure
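The README notes that the canonical validation split was not used and that 5% of `train` served as validation instead. The commit does not show how that split was made; a minimal sketch of the technique, using a plain shuffled index split (the `split_train_validation` helper and the 42 seed are assumptions, not from the source), could look like:

```python
import random

def split_train_validation(samples, val_fraction=0.05, seed=42):
    """Shuffle indices and hold out `val_fraction` of the samples as validation."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_val = int(len(samples) * val_fraction)  # size of the held-out validation set
    val = [samples[i] for i in indices[:n_val]]
    train = [samples[i] for i in indices[n_val:]]
    return train, val

# Illustrative toy data, not the actual mc4_nl_cleaned samples.
train, val = split_train_validation(list(range(1000)))
print(len(train), len(val))  # 950 50
```

In practice a Hugging Face `datasets.Dataset` offers an equivalent built-in, `train_test_split(test_size=0.05)`, which returns the two partitions in one call.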
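The token budget added in this commit follows directly from the two numbers already in the README (fixed context length times number of samples) and can be sanity-checked with simple arithmetic:

```python
# Figures stated in the README diff: 2,008,858 training samples,
# each packed to a fixed context length of 2048 tokens.
context_length = 2048
num_samples = 2_008_858

total_tokens = context_length * num_samples
print(f"{total_tokens:,}")  # 4,114,141,184 -> roughly 4B Dutch tokens
```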