Commit 60b4878 by erikhenriksson (1 parent: 56ce875)

Update README.md

README.md (changed)
@@ -13,8 +13,9 @@ metrics:
 
 A multilingual web register classifier, fine-tuned from XLM-RoBERTa-large.
 The model is trained with the multilingual CORE corpora across five languages (English, Finnish, French, Swedish, Turkish) to classify documents based on the CORE taxonomy, detailed below.
-The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages.
+The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages (see Evaluation below).
 It is designed to support the development of open language models and for linguists analyzing register variation.
+
 ## Model Details
 
 ### Model Description
@@ -69,7 +70,7 @@ The main labels are uppercase. To only include these main labels in the predicti
 
 Use the code below to get started with the model.
 
-```
+```python
 import torch
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 
@@ -133,7 +134,7 @@ Average inference time (across 1000 iterations), using a single NVIDIA A100 GPU
 
 | Language | F1 (All labels) | F1 (Main labels) |
 | -------- | --------------- | ---------------- |
-| English | 0.72 |
+| English | 0.72 | 0.75
 | Finnish | 0.79 |
 | French | 0.75 |
 | Swedish | 0.81 |
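The second hunk header mentions that the main labels are uppercase and that predictions can be restricted to them, but the README text describing how is cut off in this diff. As a rough illustration only, the idea could be sketched as a case-based filter over predicted label names. The label names and the `main_labels_only` helper below are hypothetical and are not taken from the model's configuration or from the README.

```python
# Hypothetical label names, for illustration only. The real label set
# comes from the model's configuration, which this diff does not show.
EXAMPLE_PREDICTIONS = ["NA", "ne", "OP", "rv"]


def main_labels_only(labels):
    """Keep only labels written entirely in uppercase (assumed main labels)."""
    return [label for label in labels if label.isupper()]


if __name__ == "__main__":
    # Lowercase entries are treated as sub-labels and dropped.
    print(main_labels_only(EXAMPLE_PREDICTIONS))
```

This relies solely on the uppercase naming convention stated in the hunk header; the README may describe a different mechanism for the same goal.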