kuntur-peru-legal-es-gemma-2b-it

This model is a fine-tuned version of google/gemma-2b-it on the generator dataset. It achieves the following result on the evaluation set (a loading sketch follows the list):

  • Loss: 1.1387
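
The card does not yet document usage, so here is a minimal loading sketch. It assumes this repository hosts a PEFT adapter for google/gemma-2b-it (consistent with the framework versions listed below); the repository id comes from the model page, and the Spanish legal prompt is purely illustrative.

    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    BASE_ID = "google/gemma-2b-it"
    ADAPTER_ID = "daqc/kuntur-peru-legal-es-gemma-2b-it"  # this repository

    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"  # device_map needs accelerate
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)  # attach the fine-tuned adapter

    # gemma-2b-it expects its chat template; the question is an example only.
    messages = [{"role": "user", "content": "¿Qué establece la Constitución del Perú sobre el debido proceso?"}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))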

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction in code follows the list):

  • learning_rate: 2.5e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 66
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
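
The original training script is not published, so the following is only a hedged sketch of how the list above maps onto transformers.TrainingArguments. The output directory, AdamW (the Trainer default optimizer, matching the betas and epsilon above), fp16 as the "Native AMP" mode, and the 50-step evaluation cadence (inferred from the results table below) are assumptions.

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="kuntur-peru-legal-es-gemma-2b-it",  # assumed
        learning_rate=2.5e-5,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=1,
        seed=66,
        gradient_accumulation_steps=4,  # 4 x 4 = total train batch size of 16
        lr_scheduler_type="linear",
        num_train_epochs=50,
        fp16=True,                      # "Native AMP" (assumed fp16 rather than bf16)
        evaluation_strategy="steps",    # assumed from the results table
        eval_steps=50,
        logging_steps=50,
    )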

Training results

Training Loss | Epoch | Step | Validation Loss
------------- | ----- | ---- | ---------------
3.7041 | 0.51 | 50 | 3.6704
2.5585 | 1.02 | 100 | 2.5245
1.8723 | 1.53 | 150 | 1.9012
1.697 | 2.05 | 200 | 1.6294
1.5123 | 2.56 | 250 | 1.5092
1.3844 | 3.07 | 300 | 1.4406
1.4082 | 3.58 | 350 | 1.3942
1.3473 | 4.09 | 400 | 1.3614
1.2698 | 4.6 | 450 | 1.3338
1.3179 | 5.12 | 500 | 1.3127
1.2776 | 5.63 | 550 | 1.2942
1.2529 | 6.14 | 600 | 1.2781
1.2148 | 6.65 | 650 | 1.2667
1.2378 | 7.16 | 700 | 1.2538
1.1976 | 7.67 | 750 | 1.2418
1.2107 | 8.18 | 800 | 1.2325
1.199 | 8.7 | 850 | 1.2216
1.1498 | 9.21 | 900 | 1.2149
1.1788 | 9.72 | 950 | 1.2059
1.0873 | 10.23 | 1000 | 1.1995
1.1124 | 10.74 | 1050 | 1.1912
1.1161 | 11.25 | 1100 | 1.1858
1.1408 | 11.76 | 1150 | 1.1782
1.083 | 12.28 | 1200 | 1.1735
1.1234 | 12.79 | 1250 | 1.1659
1.1065 | 13.3 | 1300 | 1.1609
1.112 | 13.81 | 1350 | 1.1555
1.0759 | 14.32 | 1400 | 1.1513
1.0783 | 14.83 | 1450 | 1.1462
1.0466 | 15.35 | 1500 | 1.1455
1.0334 | 15.86 | 1550 | 1.1424
1.045 | 16.37 | 1600 | 1.1405
1.016 | 16.88 | 1650 | 1.1393
1.0449 | 17.39 | 1700 | 1.1371
1.0642 | 17.9 | 1750 | 1.1338
1.0276 | 18.41 | 1800 | 1.1340
1.0328 | 18.93 | 1850 | 1.1313
1.0232 | 19.44 | 1900 | 1.1326
1.0588 | 19.95 | 1950 | 1.1284
0.9971 | 20.46 | 2000 | 1.1298
1.0561 | 20.97 | 2050 | 1.1269
1.0714 | 21.48 | 2100 | 1.1279
1.0358 | 21.99 | 2150 | 1.1270
0.9744 | 22.51 | 2200 | 1.1274
1.0019 | 23.02 | 2250 | 1.1275
0.9362 | 23.53 | 2300 | 1.1258
1.0143 | 24.04 | 2350 | 1.1254
1.009 | 24.55 | 2400 | 1.1290
0.9969 | 25.06 | 2450 | 1.1253
0.8828 | 25.58 | 2500 | 1.1256
1.022 | 26.09 | 2550 | 1.1257
0.9804 | 26.6 | 2600 | 1.1265
0.9851 | 27.11 | 2650 | 1.1276
0.9617 | 27.62 | 2700 | 1.1265
0.9346 | 28.13 | 2750 | 1.1263
0.9552 | 28.64 | 2800 | 1.1258
0.9376 | 29.16 | 2850 | 1.1287
0.9359 | 29.67 | 2900 | 1.1262
0.9447 | 30.18 | 2950 | 1.1271
0.9646 | 30.69 | 3000 | 1.1278
0.926 | 31.2 | 3050 | 1.1293
0.9456 | 31.71 | 3100 | 1.1293
0.9223 | 32.23 | 3150 | 1.1296
0.9589 | 32.74 | 3200 | 1.1278
1.0145 | 33.25 | 3250 | 1.1299
0.9315 | 33.76 | 3300 | 1.1292
0.8946 | 34.27 | 3350 | 1.1311
0.9441 | 34.78 | 3400 | 1.1297
0.8996 | 35.29 | 3450 | 1.1317
0.9307 | 35.81 | 3500 | 1.1290
0.9005 | 36.32 | 3550 | 1.1329
0.9167 | 36.83 | 3600 | 1.1303
0.9393 | 37.34 | 3650 | 1.1322
0.9658 | 37.85 | 3700 | 1.1313
0.9375 | 38.36 | 3750 | 1.1341
0.9176 | 38.87 | 3800 | 1.1326
0.8982 | 39.39 | 3850 | 1.1351
0.9685 | 39.9 | 3900 | 1.1326
0.9216 | 40.41 | 3950 | 1.1355
0.9542 | 40.92 | 4000 | 1.1342
0.8739 | 41.43 | 4050 | 1.1371
0.9329 | 41.94 | 4100 | 1.1355
0.9335 | 42.46 | 4150 | 1.1354
0.8851 | 42.97 | 4200 | 1.1363
0.9217 | 43.48 | 4250 | 1.1377
0.8794 | 43.99 | 4300 | 1.1363
0.9104 | 44.5 | 4350 | 1.1371
0.8751 | 45.01 | 4400 | 1.1367
0.9157 | 45.52 | 4450 | 1.1377
0.8277 | 46.04 | 4500 | 1.1374
0.8858 | 46.55 | 4550 | 1.1384
0.9195 | 47.06 | 4600 | 1.1378
0.925 | 47.57 | 4650 | 1.1383
0.9007 | 48.08 | 4700 | 1.1384
0.9184 | 48.59 | 4750 | 1.1385
0.8798 | 49.1 | 4800 | 1.1385
0.8596 | 49.62 | 4850 | 1.1387

Framework versions

  • PEFT 0.10.0
  • Transformers 4.38.0
  • PyTorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
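
Matching these versions may matter when reproducing the adapter; a minimal sanity check, assuming the packages are installed:

    # Compare the local environment against the versions listed above.
    expected = {
        "peft": "0.10.0",
        "transformers": "4.38.0",
        "torch": "2.2.2+cu121",
        "datasets": "2.18.0",
        "tokenizers": "0.15.2",
    }
    for name, want in expected.items():
        got = __import__(name).__version__  # imports each package and reads its version
        status = "OK" if got == want else f"differs (card used {want})"
        print(f"{name} {got}: {status}")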