Update README.md (#19)
- Update README.md (8e3db4e61084625569bc3f5b7dc9d5d26d2ef56d)
Co-authored-by: Robert <roboojack@users.noreply.huggingface.co>
README.md
CHANGED
@@ -139,7 +139,7 @@ Falcon-7B was trained on 1,500B tokens of [RefinedWeb](https://huggingface.co/da
 | Conversations | 6% | 85B | Reddit, StackOverflow, HackerNews |
 | Code | 3% | 45B | |
 | RefinedWeb-French | 3% | 45B | massive web crawl |
-| Technical | 2% | 30B | arXiv, PubMed,
+| Technical | 2% | 30B | arXiv, PubMed, USPTO, etc. |


 The data was tokenized with the Falcon-[7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) tokenizer.