Upload the tokenizer and corresponding files
#1 opened by DrewG
This PR uploads the tokenizer (vocab size == 50k) to the repo, along with a JSON file specifying the 3 special tokens we use and the tokenizer configuration we used to train it.
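For anyone consuming these files after the merge, here is a minimal sketch of loading the tokenizer and checking the uploaded pieces with the transformers library; the repo id below is a placeholder, not this repository's actual name:

```python
# Minimal sketch: load the merged tokenizer files and inspect them.
# "your-org/your-model" is a hypothetical repo id; substitute the real one.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-org/your-model")

print(len(tokenizer))                # expected to be ~50k per this PR
print(tokenizer.special_tokens_map)  # should reflect the 3 special tokens
```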
DrewG changed pull request status to merged