Update README.md
#13
by mmendoza - opened
README.md
CHANGED
@@ -55,7 +55,7 @@ The models are made available under a non-commercial CC BY-NC 4.0 license. More
 
 ## Training Data
 
-The GALACTICA models are trained on 106 billion tokens of open-access scientific text and data. This includes papers, textbooks, scientific websites, encyclopedias, reference material, knowledge bases, and more. We tokenize different modalities to provide a natural
+The GALACTICA models are trained on 106 billion tokens of open-access scientific text and data. This includes papers, textbooks, scientific websites, encyclopedias, reference material, knowledge bases, and more. We tokenize different modalities to provide a natural language interface for different tasks. See the README.md for more information. See the paper for full information on the training data.
 
 ## How to use
 
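The hunk above only changes the Training Data paragraph; the body of the "## How to use" section is not shown in this diff. For context, here is a minimal sketch of the usual Transformers loading pattern for the GALACTICA checkpoints, assuming the `facebook/galactica-1.3b` hub id and the OPT-based causal LM class; the model name and prompt are illustrative placeholders, not part of this PR.

```python
# Minimal sketch: load a GALACTICA checkpoint with Hugging Face Transformers
# (assumes the "facebook/galactica-1.3b" hub id; swap in another size as needed).
from transformers import AutoTokenizer, OPTForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
model = OPTForCausalLM.from_pretrained("facebook/galactica-1.3b")

# Example prompt; GALACTICA uses special tokens such as [START_REF] for citations.
input_text = "The Transformer architecture [START_REF]"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate a short continuation and decode it back to text.
outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```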