update readme.md
README.md
CHANGED
@@ -19,7 +19,15 @@ tags:
 - **Vocabulary Size**: 86,000
 - **Total Number of Tokens**: 1,233,628
 - **Fertility Score**: 1.589
-- It supports Arabic Diacritization
+- It supports Arabic Diacritization
+
+## Aranizer Collection Achieved State of the Art Arabic Tokenizer
+
+The Aranizer tokenizer has achieved state-of-the-art results on the [Arabic Tokenizers Leaderboard](https://huggingface.co/spaces/MohamedRashad/arabic-tokenizers-leaderboard) on Hugging Face. Below is a screenshot highlighting this achievement:
+
+<img src="./lb.png" alt="Screenshot showing the Aranizer Tokenizer achieving state of the art" width="800">
+
+
 
 ## How to Use the Aranizer Tokenizer
 