Update README.md
README.md
@@ -140,13 +140,13 @@ The model was trained for 1T tokens (with batch size 1760 and sequence length 20
 | Data Source | Number of Tokens in Source | Proportion | Effective Number of Tokens | Epochs |
 |-------------|----------------------------|------------|----------------------------|--------|
 | mC4 3.1.0 - English | 417.99 B | 0.33 | 330 B | 0.14 |
-| C4 - English - SemDedup 80% | 100.42 B | 0.
+| C4 - English - SemDedup 80% | 100.42 B | 0.299 | 299 B | 2.98 |
 | RedPajama - CommonCrawl | 878.45 B | 0.1 | 100 B | 0.11 |
 | The Stack - Selected Languages | 463.78 B | 0.1 | 100 B | 0.22 |
 | RedPajama - Wikipedia - En | 4.87 B | 0.04 | 40 B | 8.21 |
 | The Stack - Markdown | 107.07 B | 0.035 | 35 B | 0.33 |
-| S2ORC | 48.85 B | 0.
-| RedPajama - Books | 26.02 B | 0.
+| S2ORC | 48.85 B | 0.033 | 33 B | 0.68 |
+| RedPajama - Books | 26.02 B | 0.03 | 30B | 1.15 |
 | RedPajama - arXiv | 28.10 B | 0.019 | 19 B | 0.68 |
 | RedPajama - StackExchange | 20.54 B | 0.014 | 14 B |0.68 |
 
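The columns in the updated table appear to be related by simple arithmetic. The sketch below is a sanity check, not the authors' tooling. It assumes that "Effective Number of Tokens" is the proportion times the 1T-token training budget mentioned in the hunk header, and that "Epochs" is the effective count divided by the tokens in the source. The names `TOTAL_TOKENS_B`, `ROWS`, `derive`, and `derived_table` are hypothetical and introduced only for illustration.

```python
# Assumption: 1T training tokens, expressed in billions to match the table.
TOTAL_TOKENS_B = 1000.0

# (data source, tokens in source [B], proportion), copied from the updated table.
ROWS = [
    ("mC4 3.1.0 - English", 417.99, 0.33),
    ("C4 - English - SemDedup 80%", 100.42, 0.299),
    ("RedPajama - CommonCrawl", 878.45, 0.1),
    ("The Stack - Selected Languages", 463.78, 0.1),
    ("RedPajama - Wikipedia - En", 4.87, 0.04),
    ("The Stack - Markdown", 107.07, 0.035),
    ("S2ORC", 48.85, 0.033),
    ("RedPajama - Books", 26.02, 0.03),
    ("RedPajama - arXiv", 28.10, 0.019),
    ("RedPajama - StackExchange", 20.54, 0.014),
]


def derive(source_b, proportion, total_b=TOTAL_TOKENS_B):
    """Return (effective tokens [B], epochs) under the assumed formulas."""
    effective_b = proportion * total_b   # assumed: share of the training budget
    epochs = effective_b / source_b      # assumed: passes over the source data
    return effective_b, epochs


def derived_table(rows=ROWS):
    """Map each data source to its derived (effective tokens, epochs) pair."""
    return {name: derive(source_b, prop) for name, source_b, prop in rows}


if __name__ == "__main__":
    for name, (effective_b, epochs) in derived_table().items():
        print(f"{name:32s} {effective_b:6.0f} B  {epochs:5.2f}")
```

Under these assumptions the three rows changed in this commit reproduce the listed values (2.98, 0.68, and 1.15 epochs), and the proportions sum to 1. The unchanged mC4 row would come out at roughly 0.79 epochs rather than the listed 0.14, so either the assumed formula does not apply to that row or the listed figure may be worth a second look.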