peft-starcoder-finetuned-cpp

This model is a fine-tuned version of bigcode/starcoderbase-1b; the training dataset is not specified. It achieves the following results on the evaluation set:

Loss: 0.6176 (validation loss at the final training step, 1000)

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 20
training_steps: 1000
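
The hyperparameters above imply two derived quantities: the effective batch size and the learning rate at any given step. The sketch below reproduces both in plain Python. It rests on two assumptions that the card does not state: training ran on a single device (which is what makes 2 x 8 equal the reported total of 16), and the cosine scheduler follows the usual linear-warmup-then-half-cosine shape that decays to zero at the last step. The names `NUM_DEVICES`, `effective_batch_size`, and `lr_at` are illustrative only.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 20
TRAINING_STEPS = 1000

# Assumption: the card does not say how many devices were used.
NUM_DEVICES = 1


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of sequences contributing to one optimizer update."""
    return per_device * accum_steps * devices


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0 to LEARNING_RATE over WARMUP_STEPS,
    then half a cosine cycle decaying to 0 at TRAINING_STEPS.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(
        TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 10, 20, 510, 1000):
        print(f"step {step:4d}: lr = {lr_at(step):.3e}")
```

Under these assumptions the learning rate peaks at step 20 and is close to zero over the last few dozen steps, which is consistent with a validation loss that stops changing near the end of training.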

Training results

Training Loss	Epoch	Step	Validation Loss
0.9707	0.1273	20	0.9157
0.9027	0.2546	40	0.8960
0.7973	0.3819	60	0.8843
0.7716	0.5091	80	0.8618
0.6858	0.6364	100	0.8392
0.6603	0.7637	120	0.8126
0.6288	0.8910	140	0.7950
0.5693	1.0183	160	0.7798
0.5035	1.1456	180	0.7706
0.5376	1.2729	200	0.7583
0.4893	1.4002	220	0.7469
0.5256	1.5274	240	0.7366
0.4646	1.6547	260	0.7262
0.5039	1.7820	280	0.7156
0.4376	1.9093	300	0.7062
0.4262	2.0366	320	0.7000
0.445	2.1639	340	0.6917
0.4307	2.2912	360	0.6847
0.4531	2.4185	380	0.6822
0.4018	2.5457	400	0.6758
0.4466	2.6730	420	0.6695
0.3934	2.8003	440	0.6649
0.3815	2.9276	460	0.6607
0.3834	3.0549	480	0.6575
0.4001	3.1822	500	0.6523
0.398	3.3095	520	0.6481
0.3824	3.4368	540	0.6453
0.3756	3.5640	560	0.6409
0.3843	3.6913	580	0.6382
0.3829	3.8186	600	0.6357
0.3534	3.9459	620	0.6345
0.4136	4.0732	640	0.6343
0.3409	4.2005	660	0.6321
0.357	4.3278	680	0.6288
0.397	4.4551	700	0.6263
0.3713	4.5823	720	0.6255
0.3914	4.7096	740	0.6242
0.3657	4.8369	760	0.6230
0.3711	4.9642	780	0.6216
0.3538	5.0915	800	0.6205
0.377	5.2188	820	0.6199
0.3426	5.3461	840	0.6194
0.3583	5.4733	860	0.6188
0.3643	5.6006	880	0.6180
0.362	5.7279	900	0.6178
0.388	5.8552	920	0.6177
0.3424	5.9825	940	0.6177
0.3614	6.1098	960	0.6176
0.3652	6.2371	980	0.6176
0.3691	6.3644	1000	0.6176
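
A few quantities can be derived from the table, and the sketch below computes them from three rows copied verbatim. It assumes the reported losses are mean per-token cross-entropy in nats, so that exp(loss) is a perplexity; the card does not confirm this. The helper names are illustrative, and the training loss is a single logged value per row, so the train/validation gap is only indicative.

```python
import math

# (training_loss, epoch, step, validation_loss), copied from the table above.
ROWS = [
    (0.9707, 0.1273, 20, 0.9157),
    (0.3534, 3.9459, 620, 0.6345),
    (0.3691, 6.3644, 1000, 0.6176),
]


def steps_per_epoch(epoch: float, step: int) -> float:
    """Optimizer steps needed to complete one pass over the training set."""
    return step / epoch


def perplexity(loss: float) -> float:
    """exp(loss); meaningful only if loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


def train_val_gap(train_loss: float, val_loss: float) -> float:
    """Validation loss minus training loss at the same logged step."""
    return val_loss - train_loss


if __name__ == "__main__":
    first, last = ROWS[0], ROWS[-1]
    spe = steps_per_epoch(last[1], last[2])
    print(f"steps per epoch: {spe:.1f}")
    # With total_train_batch_size = 16, this approximates the training-set size.
    print(f"approx. training sequences per epoch: {spe * 16:.0f}")
    print(f"validation perplexity, first row: {perplexity(first[3]):.3f}")
    print(f"validation perplexity, last row:  {perplexity(last[3]):.3f}")
    print(f"train/validation gap, last row:   {train_val_gap(last[0], last[3]):.4f}")
```

The step-to-epoch ratio works out to roughly 157 optimizer steps per epoch, so the 1000-step run covers a little over six passes over the training data.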