```
added vocab (size: 54634) with 22 dummy tokens (new size: 54656)
Vocab size: 54634
```

Training data


https://github.com/huggingface/transformers/blob/main/src/transformers/models/gpt_neox_japanese/tokenization_gpt_neox_japanese.py
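The 22 dummy tokens in the log above appear to round the vocabulary up to a multiple of 128 (54634 + 22 = 54656 = 427 × 128), a common trick for keeping the embedding table evenly divisible, e.g. for tensor-parallel sharding. This is an assumption inferred from the numbers, not confirmed by the source; a minimal sketch of that rounding:

```python
def pad_vocab(size: int, multiple: int = 128) -> tuple[int, int]:
    """Round a vocab size up to the nearest multiple; return (padded size, dummy-token count)."""
    padded = -(-size // multiple) * multiple  # ceiling division, then scale back up
    return padded, padded - size

# Reproduce the figures from the log above (54634 -> 54656 with 22 dummies).
padded, dummies = pad_vocab(54634)
print(f"added vocab (size: 54634) with {dummies} dummy tokens (new size: {padded})")
```

Here `pad_vocab` and the choice of 128 are illustrative; the actual tokenizer may use a different divisor or add the dummies elsewhere in the pipeline.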