# fedcsis-slot_baseline-xlm_r-en
This model is a fine-tuned version of xlm-roberta-base on the leyzer-fedcsis dataset.
Results on the test set:
- Precision: 0.7767
- Recall: 0.7991
- F1: 0.7877
- Accuracy: 0.9425
It achieves the following results on the evaluation set:
- Loss: 0.1097
- Precision: 0.9705
- Recall: 0.9723
- F1: 0.9714
- Accuracy: 0.9859
## Model description
More information needed
## Intended uses & limitations
More information needed
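As an illustration of the intended use (slot tagging for English virtual-assistant utterances), below is a minimal inference sketch with the `transformers` token-classification pipeline. The repository path and the example utterance are assumptions, not taken from this card; substitute the actual hub id of this checkpoint.

```python
from transformers import pipeline

# Assumed hub path -- replace with the full "<namespace>/fedcsis-slot_baseline-xlm_r-en" id.
model_id = "fedcsis-slot_baseline-xlm_r-en"

slot_tagger = pipeline(
    "token-classification",
    model=model_id,
    aggregation_strategy="simple",  # merge sub-word pieces into whole slot spans
)

# Illustrative utterance only.
print(slot_tagger("play the newest album by Radiohead on Spotify"))
# Each entry carries entity_group (the slot name), score, word, and start/end offsets.
```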
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
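For context, a sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The `output_dir` and `evaluation_strategy` values are illustrative assumptions; the Adam betas and epsilon listed above are the `Trainer` defaults and need no explicit argument.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir and evaluation_strategy are assumptions.
training_args = TrainingArguments(
    output_dir="fedcsis-slot_baseline-xlm_r-en",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    evaluation_strategy="epoch",  # evaluate once per epoch, as in the results table below
)
```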
### Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
1.2866 | 1.0 | 814 | 0.3188 | 0.8661 | 0.8672 | 0.8666 | 0.9250 |
0.1956 | 2.0 | 1628 | 0.1299 | 0.9409 | 0.9471 | 0.9440 | 0.9736 |
0.1063 | 3.0 | 2442 | 0.1196 | 0.9537 | 0.9607 | 0.9572 | 0.9810 |
0.0558 | 4.0 | 3256 | 0.0789 | 0.9661 | 0.9697 | 0.9679 | 0.9854 |
0.0367 | 5.0 | 4070 | 0.0824 | 0.9685 | 0.9690 | 0.9687 | 0.9848 |
0.031 | 6.0 | 4884 | 0.0887 | 0.9712 | 0.9728 | 0.9720 | 0.9859 |
0.0233 | 7.0 | 5698 | 0.0829 | 0.9736 | 0.9744 | 0.9740 | 0.9872 |
0.0139 | 8.0 | 6512 | 0.0879 | 0.9743 | 0.9747 | 0.9745 | 0.9876 |
0.007 | 9.0 | 7326 | 0.0978 | 0.9740 | 0.9734 | 0.9737 | 0.9870 |
0.0076 | 10.0 | 8140 | 0.1015 | 0.9723 | 0.9726 | 0.9725 | 0.9860 |
0.026 | 11.0 | 814 | 0.1264 | 0.9732 | 0.9620 | 0.9676 | 0.9829 |
0.0189 | 12.0 | 1628 | 0.0975 | 0.9732 | 0.9711 | 0.9722 | 0.9861 |
0.0099 | 13.0 | 2442 | 0.1080 | 0.9721 | 0.9715 | 0.9718 | 0.9866 |
0.0052 | 14.0 | 3256 | 0.1052 | 0.9706 | 0.9715 | 0.9710 | 0.9860 |
0.0031 | 15.0 | 4070 | 0.1097 | 0.9705 | 0.9723 | 0.9714 | 0.9859 |
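The validation precision, recall, F1, and accuracy above are entity-level metrics of the kind produced by seqeval. One common way to compute them during training is a `compute_metrics` function like the sketch below; the use of the `evaluate` library, the toy `id2label` mapping, and the -100 padding convention are assumptions rather than the documented setup for this model.

```python
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")

# Illustrative subset of the label map; in practice use model.config.id2label.
id2label = {0: "O", 1: "B-artist", 2: "I-artist"}

def compute_metrics(eval_pred):
    """Entity-level precision/recall/F1 and token accuracy, as in the table above."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Drop special and padding positions, conventionally labelled -100.
    true_labels = [[id2label[l] for l in label if l != -100] for label in labels]
    true_preds = [
        [id2label[p] for p, l in zip(pred, label) if l != -100]
        for pred, label in zip(predictions, labels)
    ]

    results = seqeval.compute(predictions=true_preds, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```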
## Per-slot evaluation on the test set
slot_name | precision | recall | f1 | tc_size |
---|---|---|---|---|
album | 0.7000 | 0.8750 | 0.7778 | 8 |
album_name | 0.9091 | 0.6250 | 0.7407 | 16 |
album_type | 0.1842 | 0.5385 | 0.2745 | 13 |
album_type_1a | 0.0000 | 0.0000 | 0.0000 | 10 |
album_type_an | 0.0000 | 0.0000 | 0.0000 | 20 |
all_lang | 0.5556 | 0.7143 | 0.6250 | 7 |
artist | 0.7500 | 0.7857 | 0.7674 | 42 |
av_alias | 0.8333 | 0.5263 | 0.6452 | 19 |
caption | 0.8065 | 0.7576 | 0.7813 | 33 |
category | 0.8571 | 1.0000 | 0.9231 | 18 |
channel | 0.6786 | 0.8085 | 0.7379 | 47 |
channel_id | 0.7826 | 0.9000 | 0.8372 | 20 |
count | 0.5714 | 1.0000 | 0.7273 | 4 |
date | 0.8333 | 0.7500 | 0.7895 | 40 |
date_day | 1.0000 | 1.0000 | 1.0000 | 4 |
date_month | 1.0000 | 1.0000 | 1.0000 | 8 |
device_name | 0.8621 | 0.7576 | 0.8065 | 33 |
1.0000 | 1.0000 | 1.0000 | 16 | |
event_name | 0.5467 | 0.5325 | 0.5395 | 77 |
file_name | 0.7333 | 0.7857 | 0.7586 | 14 |
file_size | 1.0000 | 1.0000 | 1.0000 | 1 |
filename | 0.7083 | 0.7391 | 0.7234 | 23 |
filter | 0.8333 | 0.9375 | 0.8824 | 16 |
from | 1.0000 | 1.0000 | 1.0000 | 33 |
hashtag | 1.0000 | 0.6000 | 0.7500 | 10 |
img_query | 0.9388 | 0.9246 | 0.9316 | 199 |
label | 0.2500 | 1.0000 | 0.4000 | 1 |
location | 0.8319 | 0.9171 | 0.8724 | 205 |
1.0000 | 1.0000 | 1.0000 | 2 | |
massage | 0.0000 | 0.0000 | 0.0000 | 1 |
mesage | 0.0000 | 0.0000 | 0.0000 | 1 |
message | 0.5856 | 0.5285 | 0.5556 | 123 |
mime_type | 0.6667 | 1.0000 | 0.8000 | 2 |
name | 0.9412 | 0.8767 | 0.9078 | 73 |
pathname | 0.7805 | 0.6809 | 0.7273 | 47 |
percent | 1.0000 | 0.9583 | 0.9787 | 24 |
phone_number | 1.0000 | 1.0000 | 1.0000 | 48 |
phone_type | 1.0000 | 0.9375 | 0.9677 | 16 |
picture_url | 1.0000 | 1.0000 | 1.0000 | 14 |
playlist | 0.7219 | 0.8134 | 0.7649 | 134 |
portal | 0.8108 | 0.7692 | 0.7895 | 39 |
power | 1.0000 | 1.0000 | 1.0000 | 1 |
priority | 0.6667 | 1.0000 | 0.8000 | 2 |
purpose | 1.0000 | 1.0000 | 1.0000 | 8 |
query | 0.6706 | 0.6064 | 0.6369 | 94 |
rating | 0.9167 | 0.9167 | 0.9167 | 12 |
review_count | 0.8750 | 0.7778 | 0.8235 | 9 |
section | 0.9091 | 0.9091 | 0.9091 | 22 |
seek_time | 0.6667 | 1.0000 | 0.8000 | 2 |
sender | 0.6000 | 0.6000 | 0.6000 | 10 |
sender_address | 0.6364 | 0.8750 | 0.7368 | 8 |
song | 0.5476 | 0.6133 | 0.5786 | 75 |
src_lang_de | 0.8765 | 0.9467 | 0.9103 | 75 |
src_lang_en | 0.6604 | 0.6481 | 0.6542 | 54 |
src_lang_es | 0.8132 | 0.9024 | 0.8555 | 82 |
src_lang_fr | 0.8795 | 0.9125 | 0.8957 | 80 |
src_lang_it | 0.8590 | 0.9437 | 0.8993 | 71 |
src_lang_pl | 0.7910 | 0.8833 | 0.8346 | 60 |
state | 1.0000 | 1.0000 | 1.0000 | 1 |
status | 0.5455 | 0.5000 | 0.5217 | 12 |
subject | 0.6154 | 0.5581 | 0.5854 | 86 |
text_de | 0.9091 | 0.9091 | 0.9091 | 77 |
text_en | 0.5909 | 0.5843 | 0.5876 | 89 |
text_es | 0.7935 | 0.8111 | 0.8022 | 90 |
text_esi | 0.0000 | 0.0000 | 0.0000 | 1 |
text_fr | 0.9125 | 0.8588 | 0.8848 | 85 |
text_it | 0.8205 | 0.9014 | 0.8591 | 71 |
text_multi | 0.3333 | 1.0000 | 0.5000 | 1 |
text_pl | 0.8167 | 0.7656 | 0.7903 | 64 |
time | 0.8750 | 1.0000 | 0.9333 | 7 |
to | 0.8927 | 0.9186 | 0.9054 | 172 |
topic | 0.4000 | 0.6667 | 0.5000 | 3 |
translator | 0.7991 | 0.9777 | 0.8794 | 179 |
trg_lang_de | 0.8116 | 0.8615 | 0.8358 | 65 |
trg_lang_en | 0.8841 | 0.8841 | 0.8841 | 69 |
trg_lang_es | 0.8906 | 0.8769 | 0.8837 | 65 |
trg_lang_fr | 0.8676 | 0.9365 | 0.9008 | 63 |
trg_lang_general | 0.8235 | 0.7368 | 0.7778 | 19 |
trg_lang_it | 0.8254 | 0.8667 | 0.8455 | 60 |
trg_lang_pl | 0.8077 | 0.8630 | 0.8344 | 73 |
txt_query | 0.5714 | 0.7059 | 0.6316 | 17 |
username | 0.6875 | 0.7333 | 0.7097 | 15 |
value | 0.7500 | 0.8571 | 0.8000 | 14 |
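A per-slot breakdown like the table above (where tc_size is the number of slot occurrences, i.e. the support, in the test set) can be produced with seqeval's classification report, which scores each entity type separately. A minimal sketch, assuming gold and predicted IOB tag sequences for the test set are already available:

```python
from seqeval.metrics import classification_report

# y_true / y_pred: per-utterance IOB tag sequences for the test set (toy values here).
y_true = [["O", "B-artist", "I-artist", "O", "B-playlist"]]
y_pred = [["O", "B-artist", "I-artist", "O", "O"]]

# digits=4 matches the precision of the table; the "support" column corresponds to tc_size.
print(classification_report(y_true, y_pred, digits=4))
```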
### Framework versions
- Transformers 4.27.4
- Pytorch 1.13.1+cu116
- Datasets 2.11.0
- Tokenizers 0.13.2
## Citation
If you use this model, please cite the following:
@inproceedings{kubis2023caiccaic,
  author={Marek Kubis and Paweł Skórzewski and Marcin Sowański and Tomasz Ziętkiewicz},
  title={Center for Artificial Intelligence Challenge on Conversational AI Correctness},
  booktitle={Proceedings of the 18th Conference on Computer Science and Intelligence Systems},
  series={Annals of Computer Science and Information Systems},
  volume={35},
  pages={1319--1324},
  year={2023},
  doi={10.15439/2023B6058},
  url={http://dx.doi.org/10.15439/2023B6058}
}