kuntur-peru-legal-es-gemma-2b-it

This model is a fine-tuned version of google/gemma-2b-it on the generator dataset. It achieves the following result on the evaluation set (a loading sketch follows the list):

  • Loss: 1.1387
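
The card does not yet document usage, so here is a minimal loading sketch. It assumes this repository hosts a PEFT adapter for google/gemma-2b-it (consistent with the framework versions listed below); the repository id comes from the model page, and the Spanish legal prompt is purely illustrative.

    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    BASE_ID = "google/gemma-2b-it"
    ADAPTER_ID = "daqc/kuntur-peru-legal-es-gemma-2b-it"  # this repository

    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"  # device_map needs accelerate
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)  # attach the fine-tuned adapter

    # gemma-2b-it expects its chat template; the question is an example only.
    messages = [{"role": "user", "content": "¿Qué establece la Constitución del Perú sobre el debido proceso?"}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))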

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction in code follows the list):

  • learning_rate: 2.5e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 66
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
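
The original training script is not published, so the following is only a hedged sketch of how the list above maps onto transformers.TrainingArguments. The output directory, AdamW (the Trainer default optimizer, matching the betas and epsilon above), fp16 as the "Native AMP" mode, and the 50-step evaluation cadence (inferred from the results table below) are assumptions.

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="kuntur-peru-legal-es-gemma-2b-it",  # assumed
        learning_rate=2.5e-5,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=1,
        seed=66,
        gradient_accumulation_steps=4,  # 4 x 4 = total train batch size of 16
        lr_scheduler_type="linear",
        num_train_epochs=50,
        fp16=True,                      # "Native AMP" (assumed fp16 rather than bf16)
        evaluation_strategy="steps",    # assumed from the results table
        eval_steps=50,
        logging_steps=50,
    )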

Training results

Training Loss | Epoch | Step | Validation Loss
------------- | ----- | ---- | ---------------
3.7041 | 0.51 | 50 | 3.6704
2.5585 | 1.02 | 100 | 2.5245
1.8723 | 1.53 | 150 | 1.9012
1.697 | 2.05 | 200 | 1.6294
1.5123 | 2.56 | 250 | 1.5092
1.3844 | 3.07 | 300 | 1.4406
1.4082 | 3.58 | 350 | 1.3942
1.3473 | 4.09 | 400 | 1.3614
1.2698 | 4.6 | 450 | 1.3338
1.3179 | 5.12 | 500 | 1.3127
1.2776 | 5.63 | 550 | 1.2942
1.2529 | 6.14 | 600 | 1.2781
1.2148 | 6.65 | 650 | 1.2667
1.2378 | 7.16 | 700 | 1.2538
1.1976 | 7.67 | 750 | 1.2418
1.2107 | 8.18 | 800 | 1.2325
1.199 | 8.7 | 850 | 1.2216
1.1498 | 9.21 | 900 | 1.2149
1.1788 | 9.72 | 950 | 1.2059
1.0873 | 10.23 | 1000 | 1.1995
1.1124 | 10.74 | 1050 | 1.1912
1.1161 | 11.25 | 1100 | 1.1858
1.1408 | 11.76 | 1150 | 1.1782
1.083 | 12.28 | 1200 | 1.1735
1.1234 | 12.79 | 1250 | 1.1659
1.1065 | 13.3 | 1300 | 1.1609
1.112 | 13.81 | 1350 | 1.1555
1.0759 | 14.32 | 1400 | 1.1513
1.0783 | 14.83 | 1450 | 1.1462
1.0466 | 15.35 | 1500 | 1.1455
1.0334 | 15.86 | 1550 | 1.1424
1.045 | 16.37 | 1600 | 1.1405
1.016 | 16.88 | 1650 | 1.1393
1.0449 | 17.39 | 1700 | 1.1371
1.0642 | 17.9 | 1750 | 1.1338
1.0276 | 18.41 | 1800 | 1.1340
1.0328 | 18.93 | 1850 | 1.1313
1.0232 | 19.44 | 1900 | 1.1326
1.0588 | 19.95 | 1950 | 1.1284
0.9971 | 20.46 | 2000 | 1.1298
1.0561 | 20.97 | 2050 | 1.1269
1.0714 | 21.48 | 2100 | 1.1279
1.0358 | 21.99 | 2150 | 1.1270
0.9744 | 22.51 | 2200 | 1.1274
1.0019 | 23.02 | 2250 | 1.1275
0.9362 | 23.53 | 2300 | 1.1258
1.0143 | 24.04 | 2350 | 1.1254
1.009 | 24.55 | 2400 | 1.1290
0.9969 | 25.06 | 2450 | 1.1253
0.8828 | 25.58 | 2500 | 1.1256
1.022 | 26.09 | 2550 | 1.1257
0.9804 | 26.6 | 2600 | 1.1265
0.9851 | 27.11 | 2650 | 1.1276
0.9617 | 27.62 | 2700 | 1.1265
0.9346 | 28.13 | 2750 | 1.1263
0.9552 | 28.64 | 2800 | 1.1258
0.9376 | 29.16 | 2850 | 1.1287
0.9359 | 29.67 | 2900 | 1.1262
0.9447 | 30.18 | 2950 | 1.1271
0.9646 | 30.69 | 3000 | 1.1278
0.926 | 31.2 | 3050 | 1.1293
0.9456 | 31.71 | 3100 | 1.1293
0.9223 | 32.23 | 3150 | 1.1296
0.9589 | 32.74 | 3200 | 1.1278
1.0145 | 33.25 | 3250 | 1.1299
0.9315 | 33.76 | 3300 | 1.1292
0.8946 | 34.27 | 3350 | 1.1311
0.9441 | 34.78 | 3400 | 1.1297
0.8996 | 35.29 | 3450 | 1.1317
0.9307 | 35.81 | 3500 | 1.1290
0.9005 | 36.32 | 3550 | 1.1329
0.9167 | 36.83 | 3600 | 1.1303
0.9393 | 37.34 | 3650 | 1.1322
0.9658 | 37.85 | 3700 | 1.1313
0.9375 | 38.36 | 3750 | 1.1341
0.9176 | 38.87 | 3800 | 1.1326
0.8982 | 39.39 | 3850 | 1.1351
0.9685 | 39.9 | 3900 | 1.1326
0.9216 | 40.41 | 3950 | 1.1355
0.9542 | 40.92 | 4000 | 1.1342
0.8739 | 41.43 | 4050 | 1.1371
0.9329 | 41.94 | 4100 | 1.1355
0.9335 | 42.46 | 4150 | 1.1354
0.8851 | 42.97 | 4200 | 1.1363
0.9217 | 43.48 | 4250 | 1.1377
0.8794 | 43.99 | 4300 | 1.1363
0.9104 | 44.5 | 4350 | 1.1371
0.8751 | 45.01 | 4400 | 1.1367
0.9157 | 45.52 | 4450 | 1.1377
0.8277 | 46.04 | 4500 | 1.1374
0.8858 | 46.55 | 4550 | 1.1384
0.9195 | 47.06 | 4600 | 1.1378
0.925 | 47.57 | 4650 | 1.1383
0.9007 | 48.08 | 4700 | 1.1384
0.9184 | 48.59 | 4750 | 1.1385
0.8798 | 49.1 | 4800 | 1.1385
0.8596 | 49.62 | 4850 | 1.1387

Framework versions

  • PEFT 0.10.0
  • Transformers 4.38.0
  • PyTorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
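
Matching these versions may matter when reproducing the adapter; a minimal sanity check, assuming the packages are installed:

    # Compare the local environment against the versions listed above.
    expected = {
        "peft": "0.10.0",
        "transformers": "4.38.0",
        "torch": "2.2.2+cu121",
        "datasets": "2.18.0",
        "tokenizers": "0.15.2",
    }
    for name, want in expected.items():
        got = __import__(name).__version__  # imports each package and reads its version
        status = "OK" if got == want else f"differs (card used {want})"
        print(f"{name} {got}: {status}")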