
fedcsis-slot_baseline-xlm_r-en

This model is a fine-tuned version of xlm-roberta-base on the leyzer-fedcsis dataset.

It achieves the following results on the test set:

  • Precision: 0.7767
  • Recall: 0.7991
  • F1: 0.7877
  • Accuracy: 0.9425

It achieves the following results on the evaluation set:

  • Loss: 0.1097
  • Precision: 0.9705
  • Recall: 0.9723
  • F1: 0.9714
  • Accuracy: 0.9859

Model description

More information needed

Intended uses & limitations

More information needed
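
No usage snippet is provided on this card. Below is a minimal inference sketch assuming the model is loaded through the standard Transformers token-classification pipeline; the aggregation strategy and the example utterance are illustrative choices, not taken from the card.

```python
from transformers import pipeline

# Load the slot-filling model as a token-classification pipeline.
# The model ID is the one this card is published under.
slot_filler = pipeline(
    "token-classification",
    model="cartesinus/fedcsis-slot_baseline-xlm_r-en",
    aggregation_strategy="simple",  # merge subword pieces into whole slot spans
)

utterance = "play the new album by radiohead on spotify"  # illustrative example
for entity in slot_filler(utterance):
    # Each prediction carries the slot label, the matched span, and a confidence score.
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```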

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
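
The exact training script is not part of this card. The sketch below shows how the hyperparameters listed above might map onto the Hugging Face Trainer API for token classification; `train_dataset`, `eval_dataset`, and `num_labels` are placeholders (dataset loading and label alignment are omitted), and per-epoch evaluation is only inferred from the results table below.

```python
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
    TrainingArguments,
)

# Placeholders: the tokenized, label-aligned splits and the size of the
# BIO slot-label set are not specified on this card.
train_dataset = ...  # tokenized training split with aligned "labels"
eval_dataset = ...   # tokenized validation split with aligned "labels"
num_labels = ...     # number of BIO slot tags in the dataset

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=num_labels
)

# Values mirror the hyperparameter list above; the Adam betas/epsilon are the
# Transformers defaults, written out here only to make the mapping explicit.
training_args = TrainingArguments(
    output_dir="fedcsis-slot_baseline-xlm_r-en",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    evaluation_strategy="epoch",  # assumption, inferred from the per-epoch results table
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```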

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---------------|-------|------|-----------------|-----------|--------|--------|----------|
| 1.2866 | 1.0 | 814 | 0.3188 | 0.8661 | 0.8672 | 0.8666 | 0.9250 |
| 0.1956 | 2.0 | 1628 | 0.1299 | 0.9409 | 0.9471 | 0.9440 | 0.9736 |
| 0.1063 | 3.0 | 2442 | 0.1196 | 0.9537 | 0.9607 | 0.9572 | 0.9810 |
| 0.0558 | 4.0 | 3256 | 0.0789 | 0.9661 | 0.9697 | 0.9679 | 0.9854 |
| 0.0367 | 5.0 | 4070 | 0.0824 | 0.9685 | 0.9690 | 0.9687 | 0.9848 |
| 0.031 | 6.0 | 4884 | 0.0887 | 0.9712 | 0.9728 | 0.9720 | 0.9859 |
| 0.0233 | 7.0 | 5698 | 0.0829 | 0.9736 | 0.9744 | 0.9740 | 0.9872 |
| 0.0139 | 8.0 | 6512 | 0.0879 | 0.9743 | 0.9747 | 0.9745 | 0.9876 |
| 0.007 | 9.0 | 7326 | 0.0978 | 0.9740 | 0.9734 | 0.9737 | 0.9870 |
| 0.0076 | 10.0 | 8140 | 0.1015 | 0.9723 | 0.9726 | 0.9725 | 0.9860 |
| 0.026 | 11.0 | 814 | 0.1264 | 0.9732 | 0.9620 | 0.9676 | 0.9829 |
| 0.0189 | 12.0 | 1628 | 0.0975 | 0.9732 | 0.9711 | 0.9722 | 0.9861 |
| 0.0099 | 13.0 | 2442 | 0.1080 | 0.9721 | 0.9715 | 0.9718 | 0.9866 |
| 0.0052 | 14.0 | 3256 | 0.1052 | 0.9706 | 0.9715 | 0.9710 | 0.9860 |
| 0.0031 | 15.0 | 4070 | 0.1097 | 0.9705 | 0.9723 | 0.9714 | 0.9859 |

Per slot evaluation on test set

| slot_name | precision | recall | f1 | tc_size |
|-----------|-----------|--------|--------|---------|
| album | 0.7000 | 0.8750 | 0.7778 | 8 |
| album_name | 0.9091 | 0.6250 | 0.7407 | 16 |
| album_type | 0.1842 | 0.5385 | 0.2745 | 13 |
| album_type_1a | 0.0000 | 0.0000 | 0.0000 | 10 |
| album_type_an | 0.0000 | 0.0000 | 0.0000 | 20 |
| all_lang | 0.5556 | 0.7143 | 0.6250 | 7 |
| artist | 0.7500 | 0.7857 | 0.7674 | 42 |
| av_alias | 0.8333 | 0.5263 | 0.6452 | 19 |
| caption | 0.8065 | 0.7576 | 0.7813 | 33 |
| category | 0.8571 | 1.0000 | 0.9231 | 18 |
| channel | 0.6786 | 0.8085 | 0.7379 | 47 |
| channel_id | 0.7826 | 0.9000 | 0.8372 | 20 |
| count | 0.5714 | 1.0000 | 0.7273 | 4 |
| date | 0.8333 | 0.7500 | 0.7895 | 40 |
| date_day | 1.0000 | 1.0000 | 1.0000 | 4 |
| date_month | 1.0000 | 1.0000 | 1.0000 | 8 |
| device_name | 0.8621 | 0.7576 | 0.8065 | 33 |
| email | 1.0000 | 1.0000 | 1.0000 | 16 |
| event_name | 0.5467 | 0.5325 | 0.5395 | 77 |
| file_name | 0.7333 | 0.7857 | 0.7586 | 14 |
| file_size | 1.0000 | 1.0000 | 1.0000 | 1 |
| filename | 0.7083 | 0.7391 | 0.7234 | 23 |
| filter | 0.8333 | 0.9375 | 0.8824 | 16 |
| from | 1.0000 | 1.0000 | 1.0000 | 33 |
| hashtag | 1.0000 | 0.6000 | 0.7500 | 10 |
| img_query | 0.9388 | 0.9246 | 0.9316 | 199 |
| label | 0.2500 | 1.0000 | 0.4000 | 1 |
| location | 0.8319 | 0.9171 | 0.8724 | 205 |
| mail | 1.0000 | 1.0000 | 1.0000 | 2 |
| massage | 0.0000 | 0.0000 | 0.0000 | 1 |
| mesage | 0.0000 | 0.0000 | 0.0000 | 1 |
| message | 0.5856 | 0.5285 | 0.5556 | 123 |
| mime_type | 0.6667 | 1.0000 | 0.8000 | 2 |
| name | 0.9412 | 0.8767 | 0.9078 | 73 |
| pathname | 0.7805 | 0.6809 | 0.7273 | 47 |
| percent | 1.0000 | 0.9583 | 0.9787 | 24 |
| phone_number | 1.0000 | 1.0000 | 1.0000 | 48 |
| phone_type | 1.0000 | 0.9375 | 0.9677 | 16 |
| picture_url | 1.0000 | 1.0000 | 1.0000 | 14 |
| playlist | 0.7219 | 0.8134 | 0.7649 | 134 |
| portal | 0.8108 | 0.7692 | 0.7895 | 39 |
| power | 1.0000 | 1.0000 | 1.0000 | 1 |
| priority | 0.6667 | 1.0000 | 0.8000 | 2 |
| purpose | 1.0000 | 1.0000 | 1.0000 | 8 |
| query | 0.6706 | 0.6064 | 0.6369 | 94 |
| rating | 0.9167 | 0.9167 | 0.9167 | 12 |
| review_count | 0.8750 | 0.7778 | 0.8235 | 9 |
| section | 0.9091 | 0.9091 | 0.9091 | 22 |
| seek_time | 0.6667 | 1.0000 | 0.8000 | 2 |
| sender | 0.6000 | 0.6000 | 0.6000 | 10 |
| sender_address | 0.6364 | 0.8750 | 0.7368 | 8 |
| song | 0.5476 | 0.6133 | 0.5786 | 75 |
| src_lang_de | 0.8765 | 0.9467 | 0.9103 | 75 |
| src_lang_en | 0.6604 | 0.6481 | 0.6542 | 54 |
| src_lang_es | 0.8132 | 0.9024 | 0.8555 | 82 |
| src_lang_fr | 0.8795 | 0.9125 | 0.8957 | 80 |
| src_lang_it | 0.8590 | 0.9437 | 0.8993 | 71 |
| src_lang_pl | 0.7910 | 0.8833 | 0.8346 | 60 |
| state | 1.0000 | 1.0000 | 1.0000 | 1 |
| status | 0.5455 | 0.5000 | 0.5217 | 12 |
| subject | 0.6154 | 0.5581 | 0.5854 | 86 |
| text_de | 0.9091 | 0.9091 | 0.9091 | 77 |
| text_en | 0.5909 | 0.5843 | 0.5876 | 89 |
| text_es | 0.7935 | 0.8111 | 0.8022 | 90 |
| text_esi | 0.0000 | 0.0000 | 0.0000 | 1 |
| text_fr | 0.9125 | 0.8588 | 0.8848 | 85 |
| text_it | 0.8205 | 0.9014 | 0.8591 | 71 |
| text_multi | 0.3333 | 1.0000 | 0.5000 | 1 |
| text_pl | 0.8167 | 0.7656 | 0.7903 | 64 |
| time | 0.8750 | 1.0000 | 0.9333 | 7 |
| to | 0.8927 | 0.9186 | 0.9054 | 172 |
| topic | 0.4000 | 0.6667 | 0.5000 | 3 |
| translator | 0.7991 | 0.9777 | 0.8794 | 179 |
| trg_lang_de | 0.8116 | 0.8615 | 0.8358 | 65 |
| trg_lang_en | 0.8841 | 0.8841 | 0.8841 | 69 |
| trg_lang_es | 0.8906 | 0.8769 | 0.8837 | 65 |
| trg_lang_fr | 0.8676 | 0.9365 | 0.9008 | 63 |
| trg_lang_general | 0.8235 | 0.7368 | 0.7778 | 19 |
| trg_lang_it | 0.8254 | 0.8667 | 0.8455 | 60 |
| trg_lang_pl | 0.8077 | 0.8630 | 0.8344 | 73 |
| txt_query | 0.5714 | 0.7059 | 0.6316 | 17 |
| username | 0.6875 | 0.7333 | 0.7097 | 15 |
| value | 0.7500 | 0.8571 | 0.8000 | 14 |
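
The evaluation script itself is not included in this card. A per-slot breakdown with precision, recall, F1, and support like the one above can be produced with seqeval's classification report; the sketch below uses toy tag sequences, which are illustrative and not taken from the dataset.

```python
from seqeval.metrics import (
    accuracy_score,
    classification_report,
    f1_score,
    precision_score,
    recall_score,
)

# Gold and predicted tag sequences, one list of BIO tags per utterance.
y_true = [["O", "B-artist", "I-artist", "O", "B-playlist"]]
y_pred = [["O", "B-artist", "I-artist", "O", "O"]]

print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("f1:       ", f1_score(y_true, y_pred))
print("accuracy: ", accuracy_score(y_true, y_pred))

# digits=4 matches the precision of the tables above; the report lists
# precision, recall, F1, and support for each slot label.
print(classification_report(y_true, y_pred, digits=4))
```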

Framework versions

  • Transformers 4.27.4
  • Pytorch 1.13.1+cu116
  • Datasets 2.11.0
  • Tokenizers 0.13.2

Citation

If you use this model, please cite the following:

@inproceedings{kubis2023caiccaic,
    author={Marek Kubis and Paweł Skórzewski and Marcin Sowański and Tomasz Ziętkiewicz},
    pages={1319–1324},
    title={Center for Artificial Intelligence Challenge on Conversational AI Correctness},
    booktitle={Proceedings of the 18th Conference on Computer Science and Intelligence Systems},
    year={2023},
    doi={10.15439/2023B6058},
    url={http://dx.doi.org/10.15439/2023B6058},
    volume={35},
    series={Annals of Computer Science and Information Systems}
}