losyer8 committed on
Commit
01d1e8e
1 Parent(s): a457786

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -124,7 +124,7 @@ The models have been pre-trained using a blend of the following datasets.
 |Codes|[The Stack](https://huggingface.co/datasets/bigcode/the-stack)|10B
 
 The pre-training was continuously conducted using a total of 10 folds of non-overlapping data, each consisting of approximately 27-28B tokens.
-We finalized the pre-training with additional (potentially) high-quality 27B tokens data obtained from the identical source data sets listed above used for the 10-fold data.
+We finalized the pre-training with additional (potentially) high-quality 27B tokens data obtained from the identical source datasets listed above used for the 10-fold data.
 
 ### Instruction tuning
 