---
library_name: transformers
tags: []
---
An XLM-RoBERTa tokenizer trained on 162M tokens of Khmer text.
```python
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("seanghay/xlm-roberta-khmer-32k-tokenizer")
tokenizer.tokenize("សួស្តីកម្ពុជា!")
```
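
To see how a string maps to vocabulary IDs and back, a small helper along these lines may be useful. This is a minimal sketch, not part of the original card: `inspect_tokenization` is a hypothetical name, and it assumes the object passed in is a standard `transformers` tokenizer exposing `tokenize`, `convert_tokens_to_ids`, and `decode`.

```python
def inspect_tokenization(tokenizer, text):
    """Return (tokens, ids, decoded) for one input string.

    Hypothetical helper: `tokenizer` is assumed to be a Hugging Face
    tokenizer, e.g. one loaded with AutoTokenizer.from_pretrained(...).
    """
    # Split the text into subword pieces.
    tokens = tokenizer.tokenize(text)
    # Map each piece to its integer ID in the vocabulary.
    ids = tokenizer.convert_tokens_to_ids(tokens)
    # Turn the IDs back into a string to check the round trip.
    decoded = tokenizer.decode(ids, skip_special_tokens=True)
    return tokens, ids, decoded
```

With `tokenizer` loaded as in the snippet above, calling `inspect_tokenization(tokenizer, text)` on any Khmer string returns the subword pieces, their IDs, and the decoded text, which makes it easy to check whether the round trip preserves the input.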