TurkuNLP/web-register-classification-multilingual (Text Classification)

erikhenriksson committed on May 2

Commit 60b4878 • 1 Parent(s): 56ce875

Update README.md

Files changed (1)

README.md +4 -3


@@ -13,8 +13,9 @@ metrics:
 A multilingual web register classifier, fine-tuned from XLM-RoBERTa-large.
 The model is trained with the multilingual CORE corpora across five languages (English, Finnish, French, Swedish, Turkish) to classify documents based on the CORE taxonomy, detailed below.
-The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages.
+The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages (see Evaluation below).
 It is designed to support the development of open language models and for linguists analyzing register variation.
 ## Model Details
 ### Model Description
@@ -69,7 +70,7 @@ The main labels are uppercase. To only include these main labels in the predicti
 Use the code below to get started with the model.
-```
+```python
 import torch
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
@@ -133,7 +134,7 @@ Average inference time (across 1000 iterations), using a single NVIDIA A100 GPU
 | Language | F1 (All labels) | F1 (Main labels) |
 | -------- | --------------- | ---------------- |
-| English  | 0.72            |
+| English  | 0.72            | 0.75
 | Finnish  | 0.79            |
 | French   | 0.75            |
 | Swedish  | 0.81            |