slippylolo
committed on
Commit 650217d
1 parent: c1a49e6
Add link to RefinedWeb
README.md
CHANGED
@@ -126,7 +126,7 @@ Falcon-7B was trained on 1,500B tokens of [RefinedWeb](https://huggingface.co/da
 
 | **Data source** | **Fraction** | **Tokens** | **Sources** |
 |--------------------|--------------|------------|-----------------------------------|
-| RefinedWeb-English | 79% | 1,185B | massive web crawl |
+| [RefinedWeb-English](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) | 79% | 1,185B | massive web crawl |
 | Books | 7% | 110B | |
 | Conversations | 6% | 85B | Reddit, StackOverflow, HackerNews |
 | Code | 3% | 45B | |