peft-starcoder-finetuned

This model is a fine-tuned version of bigcode/starcoderbase-1b on an unspecified dataset (the training script did not record a dataset name). At the final evaluation (step 1000) it reaches a validation loss of 0.8901 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 20
training_steps: 1000
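
As a rough illustration of how the values above fit together, the sketch below recomputes the total train batch size and traces a cosine schedule with linear warmup in plain Python. It assumes a single training device and a single half-cosine decay to zero, which is the usual shape of a cosine schedule but is not confirmed by this card. The function names are hypothetical helpers, not part of any library.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_STEPS = 20
TRAINING_STEPS = 1000


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def cosine_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Linear warmup from 0 to the peak rate, then a half-cosine decay to 0.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 10, 20, 510, 1000):
        print(step, cosine_lr(step))
```

Under these assumptions, 2 examples per device times 8 accumulation steps gives the listed total of 16, and the learning rate peaks at 5e-06 once the 20 warmup steps are complete.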

Training Loss	Epoch	Step	Validation Loss
1.0733	0.1631	20	0.9622
1.0649	0.3262	40	0.9528
1.0324	0.4893	60	0.9462
1.0216	0.6524	80	0.9424
1.0067	0.8155	100	0.9368
0.9977	0.9786	120	0.9329
0.97	1.1458	140	0.9302
0.9085	1.3089	160	0.9279
0.934	1.4720	180	0.9233
1.0061	1.6351	200	0.9184
0.9564	1.7982	220	0.9165
0.9738	1.9613	240	0.9126
0.8864	2.1284	260	0.9114
0.9144	2.2915	280	0.9113
0.9443	2.4546	300	0.9098
0.9444	2.6177	320	0.9083
0.887	2.7808	340	0.9058
0.9398	2.9439	360	0.9052
0.9015	3.1111	380	0.9031
0.8536	3.2742	400	0.9024
0.8765	3.4373	420	0.9002
0.9198	3.6004	440	0.8997
0.9468	3.7635	460	0.8989
0.8631	3.9266	480	0.8978
0.8777	4.0938	500	0.8977
0.9006	4.2569	520	0.8959
0.8768	4.4200	540	0.8957
0.8477	4.5831	560	0.8951
0.9061	4.7462	580	0.8937
0.8837	4.9093	600	0.8930
0.8402	5.0765	620	0.8939
0.8608	5.2396	640	0.8931
0.879	5.4027	660	0.8928
0.8562	5.5657	680	0.8922
0.8776	5.7288	700	0.8913
0.8464	5.8919	720	0.8910
0.8528	6.0591	740	0.8914
0.8538	6.2222	760	0.8910
0.8844	6.3853	780	0.8905
0.8652	6.5484	800	0.8906
0.8443	6.7115	820	0.8905
0.8546	6.8746	840	0.8899
0.8094	7.0418	860	0.8904
0.863	7.2049	880	0.8899
0.8642	7.3680	900	0.8902
0.8413	7.5311	920	0.8901
0.8119	7.6942	940	0.8903
0.8909	7.8573	960	0.8901
0.8516	8.0245	980	0.8900
0.8834	8.1876	1000	0.8901
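
To make the trend in the table easier to read, the following sketch summarizes a handful of rows copied from it: the evaluation step with the lowest validation loss, the relative improvement from the first to the last evaluation, and a rough estimate of optimizer steps per epoch. Only a subset of rows is included for brevity, and the helper names are hypothetical. The steps-per-epoch figure is an estimate derived from the logged epoch values, not a number reported by the training run.

```python
# (step, epoch, training_loss, validation_loss)
# A subset of rows copied verbatim from the table above.
ROWS = [
    (20, 0.1631, 1.0733, 0.9622),
    (200, 1.6351, 1.0061, 0.9184),
    (400, 3.2742, 0.8536, 0.9024),
    (600, 4.9093, 0.8837, 0.8930),
    (800, 6.5484, 0.8652, 0.8906),
    (840, 6.8746, 0.8546, 0.8899),
    (1000, 8.1876, 0.8834, 0.8901),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def relative_improvement(rows):
    """Fractional drop in validation loss from the first to the last row."""
    first, last = rows[0][3], rows[-1][3]
    return (first - last) / first


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    step, epoch, _, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    step, epoch, _, val_loss = best_checkpoint(ROWS)
    print(f"lowest validation loss {val_loss} at step {step} (epoch {epoch})")
    print(f"relative improvement: {relative_improvement(ROWS):.2%}")
    print(f"estimated steps per epoch: {steps_per_epoch(ROWS):.1f}")
```

Validation loss falls by about 7.5% over the run and is essentially flat after roughly step 840, while the training loss continues to fluctuate from one logging step to the next.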