v3-my_awesome

This model is a fine-tuned version of Patcas/plbart-works on an unspecified dataset (the card records the dataset name as None). It achieves the following results on the evaluation set:

Loss: 1.4256

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
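
To make the linear scheduler concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (none are listed above) and takes the total of 16500 optimizer steps from the last row of the training results table. The name `linear_lr` is illustrative and not part of any library.

```python
BASE_LR = 2e-05       # learning_rate from the hyperparameter list
TOTAL_STEPS = 16500   # final step in the training results table (100 epochs x 165 steps)
WARMUP_STEPS = 0      # assumption: the card lists no warmup


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 165, 8250, 16500):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 2e-05, is halved at the midpoint of training, and reaches zero at the final step.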

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	165	1.0474
No log	2.0	330	1.0274
No log	3.0	495	1.0536
0.2458	4.0	660	1.0316
0.2458	5.0	825	1.0409
0.2458	6.0	990	1.0534
0.1408	7.0	1155	1.0838
0.1408	8.0	1320	1.0757
0.1408	9.0	1485	1.1114
0.0813	10.0	1650	1.1037
0.0813	11.0	1815	1.0990
0.0813	12.0	1980	1.1385
0.0514	13.0	2145	1.1595
0.0514	14.0	2310	1.1591
0.0514	15.0	2475	1.1526
0.0358	16.0	2640	1.1712
0.0358	17.0	2805	1.1831
0.0358	18.0	2970	1.1991
0.027	19.0	3135	1.1804
0.027	20.0	3300	1.1840
0.027	21.0	3465	1.2039
0.0231	22.0	3630	1.2017
0.0231	23.0	3795	1.2293
0.0231	24.0	3960	1.2377
0.0182	25.0	4125	1.2383
0.0182	26.0	4290	1.2409
0.0182	27.0	4455	1.2399
0.0138	28.0	4620	1.2400
0.0138	29.0	4785	1.2569
0.0138	30.0	4950	1.2861
0.0102	31.0	5115	1.2626
0.0102	32.0	5280	1.2841
0.0102	33.0	5445	1.2767
0.0088	34.0	5610	1.2558
0.0088	35.0	5775	1.2666
0.0088	36.0	5940	1.2852
0.0088	37.0	6105	1.2958
0.0088	38.0	6270	1.3174
0.0088	39.0	6435	1.2938
0.0099	40.0	6600	1.3063
0.0099	41.0	6765	1.2998
0.0099	42.0	6930	1.3176
0.0078	43.0	7095	1.3139
0.0078	44.0	7260	1.2946
0.0078	45.0	7425	1.3100
0.0068	46.0	7590	1.3153
0.0068	47.0	7755	1.3185
0.0068	48.0	7920	1.3339
0.0063	49.0	8085	1.3284
0.0063	50.0	8250	1.3353
0.0063	51.0	8415	1.3271
0.0045	52.0	8580	1.3470
0.0045	53.0	8745	1.3348
0.0045	54.0	8910	1.3485
0.0038	55.0	9075	1.3368
0.0038	56.0	9240	1.3429
0.0038	57.0	9405	1.3564
0.0041	58.0	9570	1.3642
0.0041	59.0	9735	1.3657
0.0041	60.0	9900	1.3540
0.0033	61.0	10065	1.3671
0.0033	62.0	10230	1.3632
0.0033	63.0	10395	1.3698
0.0029	64.0	10560	1.3805
0.0029	65.0	10725	1.3878
0.0029	66.0	10890	1.3864
0.0026	67.0	11055	1.3906
0.0026	68.0	11220	1.3981
0.0026	69.0	11385	1.3931
0.0027	70.0	11550	1.3868
0.0027	71.0	11715	1.3873
0.0027	72.0	11880	1.3857
0.0025	73.0	12045	1.3879
0.0025	74.0	12210	1.3871
0.0025	75.0	12375	1.3937
0.002	76.0	12540	1.4003
0.002	77.0	12705	1.4048
0.002	78.0	12870	1.4056
0.0022	79.0	13035	1.4074
0.0022	80.0	13200	1.4064
0.0022	81.0	13365	1.4059
0.0016	82.0	13530	1.4160
0.0016	83.0	13695	1.4078
0.0016	84.0	13860	1.4132
0.0015	85.0	14025	1.4119
0.0015	86.0	14190	1.4147
0.0015	87.0	14355	1.4131
0.0014	88.0	14520	1.4131
0.0014	89.0	14685	1.4118
0.0014	90.0	14850	1.4152
0.0013	91.0	15015	1.4211
0.0013	92.0	15180	1.4213
0.0013	93.0	15345	1.4238
0.0012	94.0	15510	1.4222
0.0012	95.0	15675	1.4246
0.0012	96.0	15840	1.4247
0.0011	97.0	16005	1.4261
0.0011	98.0	16170	1.4259
0.0011	99.0	16335	1.4255
0.0011	100.0	16500	1.4256
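
The table shows training loss falling toward zero while validation loss rises after the first few epochs, which is the usual signature of overfitting. As an illustrative sketch, the snippet below picks the row with the lowest validation loss from a handful of rows copied from the table. The names `ROWS` and `best_checkpoint` are hypothetical, and only a subset of rows is included for brevity.

```python
# (epoch, step, validation_loss) rows copied from the table above; subset only.
ROWS = [
    (1, 165, 1.0474),
    (2, 330, 1.0274),
    (3, 495, 1.0536),
    (4, 660, 1.0316),
    (10, 1650, 1.1037),
    (50, 8250, 1.3353),
    (100, 16500, 1.4256),
]


def best_checkpoint(rows):
    """Return the (epoch, step, validation_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    epoch, step, loss = best_checkpoint(ROWS)
    print(f"lowest validation loss {loss:.4f} at epoch {epoch} (step {step})")
```

On these rows the minimum falls at epoch 2 (validation loss 1.0274), which is also the lowest value in the full table. The reported evaluation loss of 1.4256 comes from the final epoch.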

Framework versions

Transformers 4.35.2
Pytorch 2.1.0+cu121
Datasets 2.15.0
Tokenizers 0.15.0
