scenario-kd-po-ner-full-mdeberta_data-univner_full55

This model is a fine-tuned version of haryoaw/scenario-TCR-NER_data-univner_full. The training dataset is not specified (the auto-generated card lists it as None). It achieves the following results on the evaluation set:

Loss: 46.9305
Precision: 0.8196
Recall: 0.8292
F1: 0.8244
Accuracy: 0.9823
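
As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall. This is a minimal sketch that assumes F1 here is the usual harmonic mean of the two. The function name `f1_score` is illustrative, not taken from the training code, and the inputs are the rounded values from the card, so only agreement to about four decimal places should be expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported above for the evaluation set (already rounded in the card).
reported_precision = 0.8196
reported_recall = 0.8292
reported_f1 = 0.8244

computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.4f}, reported F1 = {reported_f1:.4f}")
```

The recomputed value agrees with the reported F1 once rounded to four decimal places.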

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 8
eval_batch_size: 32
seed: 55
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
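
The values above could be expressed as keyword arguments for `transformers.TrainingArguments`, as sketched below. This assumes the card was generated by the Hugging Face `Trainer`, so that `train_batch_size` and `eval_batch_size` are per-device sizes. The `output_dir` value and the helper names are hypothetical, and the original training script (including any distillation-specific settings) is not shown in this card.

```python
# Hyperparameters copied from the list above, renamed to the
# corresponding TrainingArguments fields (mapping is an assumption).
hyperparameters = {
    "learning_rate": 3e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 32,
    "seed": 55,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * accumulation_steps * num_devices


def build_training_arguments(output_dir: str = "out"):
    """Build TrainingArguments; imported lazily so the rest runs without transformers."""
    from transformers import TrainingArguments
    return TrainingArguments(output_dir=output_dir, **hyperparameters)


total = effective_batch_size(
    hyperparameters["per_device_train_batch_size"],
    hyperparameters["gradient_accumulation_steps"],
)
print(f"total_train_batch_size = {total}")
```

With a per-device batch of 8 and 4 accumulation steps, the listed total_train_batch_size of 32 is consistent with training on a single device.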

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
135.8523	0.2911	500	108.5680	0.5906	0.3842	0.4656	0.9517
100.5734	0.5822	1000	94.6790	0.7351	0.6641	0.6978	0.9712
91.2154	0.8732	1500	88.0640	0.7607	0.7570	0.7588	0.9762
85.1640	1.1643	2000	83.3722	0.8082	0.7302	0.7672	0.9767
80.2474	1.4554	2500	78.9752	0.7780	0.7925	0.7852	0.9791
76.4386	1.7465	3000	75.6330	0.8035	0.7945	0.7990	0.9801
73.0828	2.0375	3500	72.4945	0.7997	0.7970	0.7983	0.9804
69.7284	2.3286	4000	69.7705	0.7983	0.8048	0.8016	0.9804
67.0314	2.6197	4500	67.3742	0.8113	0.7970	0.8041	0.9805
64.9596	2.9108	5000	65.2223	0.8108	0.8025	0.8066	0.9805
62.6221	3.2019	5500	63.1795	0.8049	0.8169	0.8109	0.9810
60.6361	3.4929	6000	61.4200	0.8124	0.8186	0.8155	0.9814
58.8661	3.7840	6500	59.9772	0.8102	0.8192	0.8147	0.9815
57.5058	4.0751	7000	58.4410	0.8114	0.8168	0.8141	0.9811
55.9259	4.3662	7500	57.1486	0.8151	0.8179	0.8165	0.9814
54.6494	4.6573	8000	55.9362	0.8206	0.8155	0.8180	0.9814
53.5407	4.9483	8500	54.8810	0.8152	0.8205	0.8179	0.9816
52.3581	5.2394	9000	53.9021	0.8169	0.8266	0.8217	0.9816
51.3581	5.5305	9500	53.0325	0.8200	0.8204	0.8202	0.9816
50.5535	5.8216	10000	52.1425	0.8182	0.8282	0.8232	0.9818
49.8392	6.1126	10500	51.4247	0.8178	0.8254	0.8216	0.9817
48.9716	6.4037	11000	50.6978	0.8191	0.8338	0.8264	0.9823
48.3296	6.6948	11500	50.1578	0.8164	0.8290	0.8227	0.9818
47.7120	6.9859	12000	49.5760	0.8234	0.8266	0.8250	0.9824
47.0545	7.2770	12500	49.0523	0.8227	0.8354	0.8290	0.9821
46.6326	7.5680	13000	48.6282	0.8174	0.8287	0.8230	0.9820
46.2306	7.8591	13500	48.2713	0.8208	0.8254	0.8231	0.9819
45.9118	8.1502	14000	47.9235	0.8185	0.8259	0.8222	0.9817
45.5272	8.4413	14500	47.6086	0.8241	0.8259	0.8250	0.9822
45.2228	8.7324	15000	47.3476	0.8250	0.8321	0.8285	0.9822
44.9978	9.0234	15500	47.1635	0.8204	0.8263	0.8233	0.9821
44.8309	9.3145	16000	47.0839	0.8264	0.8285	0.8274	0.9821
44.6998	9.6056	16500	46.9565	0.8228	0.8292	0.8260	0.9824
44.6759	9.8967	17000	46.9305	0.8196	0.8292	0.8244	0.9823
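
The table can be scanned programmatically, for example to find which evaluation step had the highest F1 and to estimate the number of optimizer steps per epoch. The sketch below copies the F1 column from the table above and relies on evaluations being logged every 500 steps; the variable names are illustrative.

```python
# F1 column from the table above, in order (one value per 500 steps).
F1_BY_EVAL = [
    0.4656, 0.6978, 0.7588, 0.7672, 0.7852, 0.7990, 0.7983, 0.8016,
    0.8041, 0.8066, 0.8109, 0.8155, 0.8147, 0.8141, 0.8165, 0.8180,
    0.8179, 0.8217, 0.8202, 0.8232, 0.8216, 0.8264, 0.8227, 0.8250,
    0.8290, 0.8230, 0.8231, 0.8222, 0.8250, 0.8285, 0.8233, 0.8274,
    0.8260, 0.8244,
]
EVAL_EVERY = 500  # steps between evaluations, as in the Step column

# Pair each F1 value with its step number.
history = [(EVAL_EVERY * (i + 1), f1) for i, f1 in enumerate(F1_BY_EVAL)]

# Highest-F1 evaluation and the final evaluation.
best_step, best_f1 = max(history, key=lambda row: row[1])
final_step, final_f1 = history[-1]

# The last row reports epoch 9.8967 at step 17000.
steps_per_epoch = round(final_step / 9.8967)

print(f"best F1 {best_f1:.4f} at step {best_step}")
print(f"final F1 {final_f1:.4f} at step {final_step}")
print(f"approx. {steps_per_epoch} steps per epoch")
```

In this table the highest F1 (0.8290) occurs at step 12500, while the headline results correspond to the final evaluation at step 17000 (F1 0.8244), which is also where validation loss is lowest.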

Framework versions

Transformers 4.44.2
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
