
fedcsis-slot_baseline-xlm_r-en

This model is a fine-tuned version of xlm-roberta-base on the leyzer-fedcsis dataset.

It achieves the following results on the test set:

  • Precision: 0.7767
  • Recall: 0.7991
  • F1: 0.7877
  • Accuracy: 0.9425

It achieves the following results on the evaluation set:

  • Loss: 0.1097
  • Precision: 0.9705
  • Recall: 0.9723
  • F1: 0.9714
  • Accuracy: 0.9859

Model description

More information needed

Intended uses & limitations

More information needed
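
No usage snippet is provided on this card. Below is a minimal inference sketch assuming the model is loaded through the standard Transformers token-classification pipeline; the aggregation strategy and the example utterance are illustrative choices, not taken from the card.

```python
from transformers import pipeline

# Load the slot-filling model as a token-classification pipeline.
# The model ID is the one this card is published under.
slot_filler = pipeline(
    "token-classification",
    model="cartesinus/fedcsis-slot_baseline-xlm_r-en",
    aggregation_strategy="simple",  # merge subword pieces into whole slot spans
)

utterance = "play the new album by radiohead on spotify"  # illustrative example
for entity in slot_filler(utterance):
    # Each prediction carries the slot label, the matched span, and a confidence score.
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```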

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
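
The exact training script is not part of this card. The sketch below shows how the hyperparameters listed above might map onto the Hugging Face Trainer API for token classification; `train_dataset`, `eval_dataset`, and `num_labels` are placeholders (dataset loading and label alignment are omitted), and per-epoch evaluation is only inferred from the results table below.

```python
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
    TrainingArguments,
)

# Placeholders: the tokenized, label-aligned splits and the size of the
# BIO slot-label set are not specified on this card.
train_dataset = ...  # tokenized training split with aligned "labels"
eval_dataset = ...   # tokenized validation split with aligned "labels"
num_labels = ...     # number of BIO slot tags in the dataset

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=num_labels
)

# Values mirror the hyperparameter list above; the Adam betas/epsilon are the
# Transformers defaults, written out here only to make the mapping explicit.
training_args = TrainingArguments(
    output_dir="fedcsis-slot_baseline-xlm_r-en",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    evaluation_strategy="epoch",  # assumption, inferred from the per-epoch results table
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```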

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---------------|-------|------|-----------------|-----------|--------|--------|----------|
| 1.2866 | 1.0 | 814 | 0.3188 | 0.8661 | 0.8672 | 0.8666 | 0.9250 |
| 0.1956 | 2.0 | 1628 | 0.1299 | 0.9409 | 0.9471 | 0.9440 | 0.9736 |
| 0.1063 | 3.0 | 2442 | 0.1196 | 0.9537 | 0.9607 | 0.9572 | 0.9810 |
| 0.0558 | 4.0 | 3256 | 0.0789 | 0.9661 | 0.9697 | 0.9679 | 0.9854 |
| 0.0367 | 5.0 | 4070 | 0.0824 | 0.9685 | 0.9690 | 0.9687 | 0.9848 |
| 0.031 | 6.0 | 4884 | 0.0887 | 0.9712 | 0.9728 | 0.9720 | 0.9859 |
| 0.0233 | 7.0 | 5698 | 0.0829 | 0.9736 | 0.9744 | 0.9740 | 0.9872 |
| 0.0139 | 8.0 | 6512 | 0.0879 | 0.9743 | 0.9747 | 0.9745 | 0.9876 |
| 0.007 | 9.0 | 7326 | 0.0978 | 0.9740 | 0.9734 | 0.9737 | 0.9870 |
| 0.0076 | 10.0 | 8140 | 0.1015 | 0.9723 | 0.9726 | 0.9725 | 0.9860 |
| 0.026 | 11.0 | 814 | 0.1264 | 0.9732 | 0.9620 | 0.9676 | 0.9829 |
| 0.0189 | 12.0 | 1628 | 0.0975 | 0.9732 | 0.9711 | 0.9722 | 0.9861 |
| 0.0099 | 13.0 | 2442 | 0.1080 | 0.9721 | 0.9715 | 0.9718 | 0.9866 |
| 0.0052 | 14.0 | 3256 | 0.1052 | 0.9706 | 0.9715 | 0.9710 | 0.9860 |
| 0.0031 | 15.0 | 4070 | 0.1097 | 0.9705 | 0.9723 | 0.9714 | 0.9859 |

Per slot evaluation on test set

| slot_name | precision | recall | f1 | tc_size |
|-----------|-----------|--------|--------|---------|
| album | 0.7000 | 0.8750 | 0.7778 | 8 |
| album_name | 0.9091 | 0.6250 | 0.7407 | 16 |
| album_type | 0.1842 | 0.5385 | 0.2745 | 13 |
| album_type_1a | 0.0000 | 0.0000 | 0.0000 | 10 |
| album_type_an | 0.0000 | 0.0000 | 0.0000 | 20 |
| all_lang | 0.5556 | 0.7143 | 0.6250 | 7 |
| artist | 0.7500 | 0.7857 | 0.7674 | 42 |
| av_alias | 0.8333 | 0.5263 | 0.6452 | 19 |
| caption | 0.8065 | 0.7576 | 0.7813 | 33 |
| category | 0.8571 | 1.0000 | 0.9231 | 18 |
| channel | 0.6786 | 0.8085 | 0.7379 | 47 |
| channel_id | 0.7826 | 0.9000 | 0.8372 | 20 |
| count | 0.5714 | 1.0000 | 0.7273 | 4 |
| date | 0.8333 | 0.7500 | 0.7895 | 40 |
| date_day | 1.0000 | 1.0000 | 1.0000 | 4 |
| date_month | 1.0000 | 1.0000 | 1.0000 | 8 |
| device_name | 0.8621 | 0.7576 | 0.8065 | 33 |
| email | 1.0000 | 1.0000 | 1.0000 | 16 |
| event_name | 0.5467 | 0.5325 | 0.5395 | 77 |
| file_name | 0.7333 | 0.7857 | 0.7586 | 14 |
| file_size | 1.0000 | 1.0000 | 1.0000 | 1 |
| filename | 0.7083 | 0.7391 | 0.7234 | 23 |
| filter | 0.8333 | 0.9375 | 0.8824 | 16 |
| from | 1.0000 | 1.0000 | 1.0000 | 33 |
| hashtag | 1.0000 | 0.6000 | 0.7500 | 10 |
| img_query | 0.9388 | 0.9246 | 0.9316 | 199 |
| label | 0.2500 | 1.0000 | 0.4000 | 1 |
| location | 0.8319 | 0.9171 | 0.8724 | 205 |
| mail | 1.0000 | 1.0000 | 1.0000 | 2 |
| massage | 0.0000 | 0.0000 | 0.0000 | 1 |
| mesage | 0.0000 | 0.0000 | 0.0000 | 1 |
| message | 0.5856 | 0.5285 | 0.5556 | 123 |
| mime_type | 0.6667 | 1.0000 | 0.8000 | 2 |
| name | 0.9412 | 0.8767 | 0.9078 | 73 |
| pathname | 0.7805 | 0.6809 | 0.7273 | 47 |
| percent | 1.0000 | 0.9583 | 0.9787 | 24 |
| phone_number | 1.0000 | 1.0000 | 1.0000 | 48 |
| phone_type | 1.0000 | 0.9375 | 0.9677 | 16 |
| picture_url | 1.0000 | 1.0000 | 1.0000 | 14 |
| playlist | 0.7219 | 0.8134 | 0.7649 | 134 |
| portal | 0.8108 | 0.7692 | 0.7895 | 39 |
| power | 1.0000 | 1.0000 | 1.0000 | 1 |
| priority | 0.6667 | 1.0000 | 0.8000 | 2 |
| purpose | 1.0000 | 1.0000 | 1.0000 | 8 |
| query | 0.6706 | 0.6064 | 0.6369 | 94 |
| rating | 0.9167 | 0.9167 | 0.9167 | 12 |
| review_count | 0.8750 | 0.7778 | 0.8235 | 9 |
| section | 0.9091 | 0.9091 | 0.9091 | 22 |
| seek_time | 0.6667 | 1.0000 | 0.8000 | 2 |
| sender | 0.6000 | 0.6000 | 0.6000 | 10 |
| sender_address | 0.6364 | 0.8750 | 0.7368 | 8 |
| song | 0.5476 | 0.6133 | 0.5786 | 75 |
| src_lang_de | 0.8765 | 0.9467 | 0.9103 | 75 |
| src_lang_en | 0.6604 | 0.6481 | 0.6542 | 54 |
| src_lang_es | 0.8132 | 0.9024 | 0.8555 | 82 |
| src_lang_fr | 0.8795 | 0.9125 | 0.8957 | 80 |
| src_lang_it | 0.8590 | 0.9437 | 0.8993 | 71 |
| src_lang_pl | 0.7910 | 0.8833 | 0.8346 | 60 |
| state | 1.0000 | 1.0000 | 1.0000 | 1 |
| status | 0.5455 | 0.5000 | 0.5217 | 12 |
| subject | 0.6154 | 0.5581 | 0.5854 | 86 |
| text_de | 0.9091 | 0.9091 | 0.9091 | 77 |
| text_en | 0.5909 | 0.5843 | 0.5876 | 89 |
| text_es | 0.7935 | 0.8111 | 0.8022 | 90 |
| text_esi | 0.0000 | 0.0000 | 0.0000 | 1 |
| text_fr | 0.9125 | 0.8588 | 0.8848 | 85 |
| text_it | 0.8205 | 0.9014 | 0.8591 | 71 |
| text_multi | 0.3333 | 1.0000 | 0.5000 | 1 |
| text_pl | 0.8167 | 0.7656 | 0.7903 | 64 |
| time | 0.8750 | 1.0000 | 0.9333 | 7 |
| to | 0.8927 | 0.9186 | 0.9054 | 172 |
| topic | 0.4000 | 0.6667 | 0.5000 | 3 |
| translator | 0.7991 | 0.9777 | 0.8794 | 179 |
| trg_lang_de | 0.8116 | 0.8615 | 0.8358 | 65 |
| trg_lang_en | 0.8841 | 0.8841 | 0.8841 | 69 |
| trg_lang_es | 0.8906 | 0.8769 | 0.8837 | 65 |
| trg_lang_fr | 0.8676 | 0.9365 | 0.9008 | 63 |
| trg_lang_general | 0.8235 | 0.7368 | 0.7778 | 19 |
| trg_lang_it | 0.8254 | 0.8667 | 0.8455 | 60 |
| trg_lang_pl | 0.8077 | 0.8630 | 0.8344 | 73 |
| txt_query | 0.5714 | 0.7059 | 0.6316 | 17 |
| username | 0.6875 | 0.7333 | 0.7097 | 15 |
| value | 0.7500 | 0.8571 | 0.8000 | 14 |
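
The evaluation script itself is not included in this card. A per-slot breakdown with precision, recall, F1, and support like the one above can be produced with seqeval's classification report; the sketch below uses toy tag sequences, which are illustrative and not taken from the dataset.

```python
from seqeval.metrics import (
    accuracy_score,
    classification_report,
    f1_score,
    precision_score,
    recall_score,
)

# Gold and predicted tag sequences, one list of BIO tags per utterance.
y_true = [["O", "B-artist", "I-artist", "O", "B-playlist"]]
y_pred = [["O", "B-artist", "I-artist", "O", "O"]]

print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("f1:       ", f1_score(y_true, y_pred))
print("accuracy: ", accuracy_score(y_true, y_pred))

# digits=4 matches the precision of the tables above; the report lists
# precision, recall, F1, and support for each slot label.
print(classification_report(y_true, y_pred, digits=4))
```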

Framework versions

  • Transformers 4.27.4
  • Pytorch 1.13.1+cu116
  • Datasets 2.11.0
  • Tokenizers 0.13.2

Citation

If you use this model, please cite the following:

@inproceedings{kubis2023caiccaic,
    author={Marek Kubis and Paweł Skórzewski and Marcin Sowański and Tomasz Ziętkiewicz},
    pages={1319–1324},
    title={Center for Artificial Intelligence Challenge on Conversational AI Correctness},
    booktitle={Proceedings of the 18th Conference on Computer Science and Intelligence Systems},
    year={2023},
    doi={10.15439/2023B6058},
    url={http://dx.doi.org/10.15439/2023B6058},
    volume={35},
    series={Annals of Computer Science and Information Systems}
}