results_1024_hard

This model is a fine-tuned version of google/gemma-2b-it on an unspecified dataset (the card lists the dataset as None). It achieves the following result on the evaluation set:

  • Loss: 1.7783
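
For intuition, a cross-entropy loss can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training with Transformers but is not stated in this card.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 1.7783

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the evaluation perplexity is about 5.92.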

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 3
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 20
  • mixed_precision_training: Native AMP
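
The batch-size and scheduler settings above interact, so a small worked example may help. The sketch below reproduces total_train_batch_size from its factors and evaluates a cosine schedule with linear warmup in plain Python. It assumes a single device and 5120 optimizer steps in total (the last step logged in the training results); neither is stated explicitly in the card, and the function lr_at is a hypothetical helper, not part of any library.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 2e-4
train_batch_size = 3
gradient_accumulation_steps = 8
warmup_ratio = 0.03

# Assumptions (not stated in the card): one device, and 5120 optimizer
# steps in total, taken from the last logged step in the results table.
num_devices = 1
total_steps = 5120

effective_batch = train_batch_size * gradient_accumulation_steps * num_devices
warmup_steps = math.ceil(warmup_ratio * total_steps)


def lr_at(step: int) -> float:
    """Linear warmup followed by a single half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", effective_batch)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

With these assumptions the effective batch size is 3 × 8 = 24, matching the listed total_train_batch_size, and the learning rate peaks at 0.0002 after 154 warmup steps before decaying toward zero.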

Training results

Training Loss Epoch Step Validation Loss
3.5619 0.0390 10 3.6739
3.5429 0.0780 20 3.6357
3.5082 0.1170 30 3.5419
3.5337 0.1560 40 3.3450
3.4677 0.1950 50 3.0606
2.9464 0.2340 60 2.8547
2.7669 0.2730 70 2.6702
2.6965 0.3120 80 2.5309
2.5909 0.3510 90 2.4184
2.4393 0.3901 100 2.3441
2.4496 0.4291 110 2.3040
2.4013 0.4681 120 2.2693
2.3979 0.5071 130 2.2463
2.3603 0.5461 140 2.2143
2.221 0.5851 150 2.1913
2.3495 0.6241 160 2.1816
2.3647 0.6631 170 2.1766
2.3218 0.7021 180 2.1577
2.3126 0.7411 190 2.1344
2.1452 0.7801 200 2.1263
2.3334 0.8191 210 2.1238
2.2701 0.8581 220 2.1139
2.2801 0.8971 230 2.1050
2.2627 0.9361 240 2.0848
2.0111 0.9751 250 2.0796
2.2442 1.0141 260 2.0675
2.2478 1.0531 270 2.0694
2.282 1.0922 280 2.0713
2.2581 1.1312 290 2.0573
2.1635 1.1702 300 2.0495
2.004 1.2092 310 2.0506
2.191 1.2482 320 2.0578
2.1744 1.2872 330 2.0493
2.1702 1.3262 340 2.0327
2.1859 1.3652 350 2.0240
1.9934 1.4042 360 2.0287
2.2299 1.4432 370 2.0442
2.1927 1.4822 380 2.0247
2.1494 1.5212 390 2.0049
2.1116 1.5602 400 2.0000
1.9836 1.5992 410 2.0000
2.1745 1.6382 420 2.0106
2.1316 1.6772 430 2.0039
2.1532 1.7162 440 1.9822
2.0582 1.7552 450 1.9795
1.9695 1.7942 460 1.9824
2.1855 1.8333 470 2.0026
2.176 1.8723 480 1.9842
2.1188 1.9113 490 1.9598
2.0718 1.9503 500 1.9515
1.8728 1.9893 510 1.9540
2.0879 2.0283 520 1.9532
2.119 2.0673 530 1.9605
2.1424 2.1063 540 1.9559
2.0755 2.1453 550 1.9306
1.9815 2.1843 560 1.9407
1.952 2.2233 570 1.9472
2.1086 2.2623 580 1.9460
2.1537 2.3013 590 1.9390
2.0626 2.3403 600 1.9184
1.9718 2.3793 610 1.9251
1.9661 2.4183 620 1.9346
2.0937 2.4573 630 1.9418
2.1033 2.4963 640 1.9311
1.9948 2.5353 650 1.9051
1.9327 2.5744 660 1.9142
1.9565 2.6134 670 1.9182
2.0608 2.6524 680 1.9238
2.0804 2.6914 690 1.9166
2.0171 2.7304 700 1.8911
1.9227 2.7694 710 1.8917
1.8928 2.8084 720 1.8967
2.1007 2.8474 730 1.9101
2.0644 2.8864 740 1.8979
2.0158 2.9254 750 1.8790
1.8893 2.9644 760 1.8889
1.8667 3.0034 770 1.8789
2.0376 3.0424 780 1.8826
2.0067 3.0814 790 1.8921
2.0115 3.1204 800 1.8781
1.9622 3.1594 810 1.8691
1.6909 3.1984 820 1.8865
2.0386 3.2374 830 1.8937
2.0207 3.2765 840 1.8961
2.0297 3.3155 850 1.8736
1.9413 3.3545 860 1.8623
1.7372 3.3935 870 1.8927
2.0537 3.4325 880 1.8809
2.0477 3.4715 890 1.8878
2.0275 3.5105 900 1.8757
1.9251 3.5495 910 1.8551
1.7597 3.5885 920 1.8677
2.0448 3.6275 930 1.8671
2.0844 3.6665 940 1.8749
2.0147 3.7055 950 1.8612
1.9194 3.7445 960 1.8494
1.7378 3.7835 970 1.8554
2.029 3.8225 980 1.8591
2.0434 3.8615 990 1.8658
1.9697 3.9005 1000 1.8546
1.9241 3.9395 1010 1.8430
1.7295 3.9785 1020 1.8510
2.0073 4.0176 1030 1.8400
1.9937 4.0566 1040 1.8487
2.0309 4.0956 1050 1.8506
1.9177 4.1346 1060 1.8354
1.868 4.1736 1070 1.8401
1.7374 4.2126 1080 1.8390
2.0335 4.2516 1090 1.8501
1.9698 4.2906 1100 1.8499
1.896 4.3296 1110 1.8350
1.8848 4.3686 1120 1.8367
1.7259 4.4076 1130 1.8355
1.9634 4.4466 1140 1.8485
1.9683 4.4856 1150 1.8451
1.9561 4.5246 1160 1.8269
1.8939 4.5636 1170 1.8320
1.748 4.6026 1180 1.8308
1.98 4.6416 1190 1.8427
1.9592 4.6806 1200 1.8391
1.9849 4.7196 1210 1.8259
1.8567 4.7587 1220 1.8243
1.7734 4.7977 1230 1.8267
2.0098 4.8367 1240 1.8420
1.9283 4.8757 1250 1.8364
1.9114 4.9147 1260 1.8182
1.83 4.9537 1270 1.8210
1.6698 4.9927 1280 1.8231
1.8952 5.0317 1290 1.8315
1.97 5.0707 1300 1.8474
1.895 5.1097 1310 1.8373
1.8165 5.1487 1320 1.8196
1.7256 5.1877 1330 1.8280
1.8193 5.2267 1340 1.8230
1.9184 5.2657 1350 1.8317
1.9165 5.3047 1360 1.8342
1.9257 5.3437 1370 1.8116
1.7266 5.3827 1380 1.8229
1.819 5.4217 1390 1.8162
1.9576 5.4608 1400 1.8311
1.9448 5.4998 1410 1.8251
1.8438 5.5388 1420 1.8083
1.7266 5.5778 1430 1.8173
1.8128 5.6168 1440 1.8113
1.9558 5.6558 1450 1.8206
1.9478 5.6948 1460 1.8177
1.8407 5.7338 1470 1.8036
1.7654 5.7728 1480 1.8194
1.8663 5.8118 1490 1.8082
1.9504 5.8508 1500 1.8202
1.9463 5.8898 1510 1.8196
1.7968 5.9288 1520 1.8029
1.7647 5.9678 1530 1.8152
1.7266 6.0068 1540 1.8015
1.9003 6.0458 1550 1.8100
1.9034 6.0848 1560 1.8137
1.8396 6.1238 1570 1.8103
1.7832 6.1628 1580 1.8055
1.6215 6.2019 1590 1.8197
1.9618 6.2409 1600 1.8072
1.897 6.2799 1610 1.8152
1.8719 6.3189 1620 1.8071
1.8327 6.3579 1630 1.8009
1.6011 6.3969 1640 1.8202
1.9512 6.4359 1650 1.8061
1.8925 6.4749 1660 1.8183
1.8759 6.5139 1670 1.8104
1.7985 6.5529 1680 1.7971
1.6509 6.5919 1690 1.8161
1.9336 6.6309 1700 1.7963
1.9162 6.6699 1710 1.8053
1.8663 6.7089 1720 1.8038
1.8258 6.7479 1730 1.7938
1.59 6.7869 1740 1.8156
1.9273 6.8259 1750 1.7902
1.8987 6.8649 1760 1.8011
1.8492 6.9039 1770 1.8008
1.7775 6.9430 1780 1.7896
1.5814 6.9820 1790 1.8116
1.8407 7.0210 1800 1.7896
1.8887 7.0600 1810 1.7979
1.9142 7.0990 1820 1.8057
1.7626 7.1380 1830 1.7939
1.7565 7.1770 1840 1.7964
1.6147 7.2160 1850 1.7995
1.8237 7.2550 1860 1.7914
1.8857 7.2940 1870 1.7981
1.8479 7.3330 1880 1.7916
1.7318 7.3720 1890 1.7945
1.6457 7.4110 1900 1.7967
1.8955 7.4500 1910 1.7870
1.9229 7.4890 1920 1.7929
1.8029 7.5280 1930 1.7899
1.7959 7.5670 1940 1.7871
1.6022 7.6060 1950 1.8001
1.9062 7.6451 1960 1.7833
1.9102 7.6841 1970 1.7858
1.8149 7.7231 1980 1.7854
1.7492 7.7621 1990 1.7861
1.6135 7.8011 2000 1.7939
1.8301 7.8401 2010 1.7817
1.8828 7.8791 2020 1.7844
1.754 7.9181 2030 1.7823
1.7694 7.9571 2040 1.7831
1.6105 7.9961 2050 1.7867
1.8275 8.0351 2060 1.7847
1.7555 8.0741 2070 1.7868
1.8719 8.1131 2080 1.7861
1.7381 8.1521 2090 1.7837
1.6302 8.1911 2100 1.8011
1.7826 8.2301 2110 1.7855
1.8958 8.2691 2120 1.7826
1.852 8.3081 2130 1.7842
1.748 8.3471 2140 1.7819
1.5133 8.3862 2150 1.7930
1.8326 8.4252 2160 1.7814
1.8598 8.4642 2170 1.7784
1.8627 8.5032 2180 1.7821
1.7473 8.5422 2190 1.7797
1.642 8.5812 2200 1.7848
1.8286 8.6202 2210 1.7777
1.7744 8.6592 2220 1.7803
1.825 8.6982 2230 1.7779
1.7296 8.7372 2240 1.7785
1.5304 8.7762 2250 1.7871
1.747 8.8152 2260 1.7760
1.8467 8.8542 2270 1.7778
1.8316 8.8932 2280 1.7769
1.7574 8.9322 2290 1.7732
1.6203 8.9712 2300 1.7821
1.742 9.0102 2310 1.7773
1.7971 9.0492 2320 1.7799
1.8526 9.0882 2330 1.7805
1.7679 9.1273 2340 1.7821
1.6806 9.1663 2350 1.7809
1.4834 9.2053 2360 1.7905
1.829 9.2443 2370 1.7772
1.8293 9.2833 2380 1.7783
1.7669 9.3223 2390 1.7767
1.7476 9.3613 2400 1.7753
1.5542 9.4003 2410 1.7906
1.8146 9.4393 2420 1.7747
1.805 9.4783 2430 1.7772
1.7651 9.5173 2440 1.7778
1.7134 9.5563 2450 1.7757
1.5159 9.5953 2460 1.7883
1.8303 9.6343 2470 1.7727
1.833 9.6733 2480 1.7777
1.8066 9.7123 2490 1.7735
1.7443 9.7513 2500 1.7769
1.5211 9.7903 2510 1.7894
1.8166 9.8294 2520 1.7726
1.8283 9.8684 2530 1.7754
1.8072 9.9074 2540 1.7718
1.704 9.9464 2550 1.7751
1.5475 9.9854 2560 1.7846
1.7583 10.0244 2570 1.7745
1.7812 10.0634 2580 1.7760
1.7514 10.1024 2590 1.7782
1.6958 10.1414 2600 1.7743
1.6448 10.1804 2610 1.7814
1.6535 10.2194 2620 1.7832
1.863 10.2584 2630 1.7734
1.7768 10.2974 2640 1.7782
1.7197 10.3364 2650 1.7740
1.7241 10.3754 2660 1.7787
1.5929 10.4144 2670 1.7812
1.8043 10.4534 2680 1.7727
1.8393 10.4924 2690 1.7775
1.7536 10.5314 2700 1.7729
1.6587 10.5705 2710 1.7802
1.593 10.6095 2720 1.7824
1.8398 10.6485 2730 1.7725
1.8289 10.6875 2740 1.7765
1.7409 10.7265 2750 1.7738
1.6725 10.7655 2760 1.7778
1.6102 10.8045 2770 1.7810
1.7391 10.8435 2780 1.7710
1.751 10.8825 2790 1.7737
1.6602 10.9215 2800 1.7726
1.6149 10.9605 2810 1.7780
1.5124 10.9995 2820 1.7788
1.8017 11.0385 2830 1.7731
1.7256 11.0775 2840 1.7764
1.7597 11.1165 2850 1.7756
1.6996 11.1555 2860 1.7732
1.4229 11.1945 2870 1.7867
1.7985 11.2335 2880 1.7743
1.7496 11.2725 2890 1.7755
1.8179 11.3116 2900 1.7767
1.6852 11.3506 2910 1.7729
1.4504 11.3896 2920 1.7864
1.7277 11.4286 2930 1.7751
1.786 11.4676 2940 1.7730
1.7619 11.5066 2950 1.7744
1.6852 11.5456 2960 1.7718
1.4457 11.5846 2970 1.7845
1.8336 11.6236 2980 1.7750
1.7752 11.6626 2990 1.7715
1.7486 11.7016 3000 1.7731
1.6407 11.7406 3010 1.7728
1.5157 11.7796 3020 1.7797
1.7665 11.8186 3030 1.7730
1.7849 11.8576 3040 1.7724
1.7888 11.8966 3050 1.7738
1.7117 11.9356 3060 1.7703
1.4887 11.9746 3070 1.7802
1.7605 12.0137 3080 1.7763
1.7601 12.0527 3090 1.7729
1.7629 12.0917 3100 1.7772
1.7879 12.1307 3110 1.7705
1.618 12.1697 3120 1.7783
1.4831 12.2087 3130 1.7898
1.7599 12.2477 3140 1.7718
1.7936 12.2867 3150 1.7757
1.686 12.3257 3160 1.7778
1.6687 12.3647 3170 1.7751
1.5258 12.4037 3180 1.7826
1.7463 12.4427 3190 1.7747
1.7613 12.4817 3200 1.7738
1.7218 12.5207 3210 1.7730
1.6481 12.5597 3220 1.7736
1.4752 12.5987 3230 1.7842
1.7901 12.6377 3240 1.7730
1.8045 12.6767 3250 1.7727
1.7731 12.7157 3260 1.7709
1.6888 12.7548 3270 1.7727
1.4489 12.7938 3280 1.7822
1.8051 12.8328 3290 1.7712
1.7413 12.8718 3300 1.7720
1.6658 12.9108 3310 1.7744
1.6568 12.9498 3320 1.7724
1.4916 12.9888 3330 1.7812
1.6604 13.0278 3340 1.7756
1.78 13.0668 3350 1.7741
1.7498 13.1058 3360 1.7770
1.6802 13.1448 3370 1.7746
1.5932 13.1838 3380 1.7772
1.5691 13.2228 3390 1.7824
1.7558 13.2618 3400 1.7726
1.8435 13.3008 3410 1.7734
1.6418 13.3398 3420 1.7744
1.5933 13.3788 3430 1.7780
1.558 13.4178 3440 1.7791
1.8047 13.4569 3450 1.7728
1.7587 13.4959 3460 1.7743
1.696 13.5349 3470 1.7741
1.5533 13.5739 3480 1.7765
1.5472 13.6129 3490 1.7795
1.6872 13.6519 3500 1.7737
1.7398 13.6909 3510 1.7726
1.6902 13.7299 3520 1.7737
1.6542 13.7689 3530 1.7751
1.5921 13.8079 3540 1.7773
1.7336 13.8469 3550 1.7737
1.7588 13.8859 3560 1.7747
1.6348 13.9249 3570 1.7732
1.5502 13.9639 3580 1.7781
1.522 14.0029 3590 1.7787
1.7596 14.0419 3600 1.7735
1.7118 14.0809 3610 1.7747
1.7507 14.1199 3620 1.7761
1.6661 14.1589 3630 1.7746
1.4149 14.1980 3640 1.7810
1.7937 14.2370 3650 1.7781
1.7723 14.2760 3660 1.7736
1.7098 14.3150 3670 1.7756
1.6265 14.3540 3680 1.7776
1.4276 14.3930 3690 1.7821
1.7301 14.4320 3700 1.7783
1.7402 14.4710 3710 1.7753
1.7258 14.5100 3720 1.7756
1.6407 14.5490 3730 1.7758
1.3929 14.5880 3740 1.7812
1.7925 14.6270 3750 1.7786
1.7348 14.6660 3760 1.7742
1.7815 14.7050 3770 1.7743
1.667 14.7440 3780 1.7730
1.4567 14.7830 3790 1.7781
1.7649 14.8220 3800 1.7790
1.7334 14.8610 3810 1.7747
1.6922 14.9000 3820 1.7735
1.6162 14.9391 3830 1.7742
1.38 14.9781 3840 1.7789
1.6846 15.0171 3850 1.7780
1.6927 15.0561 3860 1.7749
1.8049 15.0951 3870 1.7742
1.7247 15.1341 3880 1.7747
1.6067 15.1731 3890 1.7761
1.4859 15.2121 3900 1.7805
1.7554 15.2511 3910 1.7764
1.7054 15.2901 3920 1.7766
1.7034 15.3291 3930 1.7772
1.6364 15.3681 3940 1.7760
1.4598 15.4071 3950 1.7795
1.7259 15.4461 3960 1.7781
1.7547 15.4851 3970 1.7758
1.6734 15.5241 3980 1.7763
1.6257 15.5631 3990 1.7770
1.4633 15.6021 4000 1.7799
1.6762 15.6412 4010 1.7782
1.7339 15.6802 4020 1.7761
1.6549 15.7192 4030 1.7761
1.5622 15.7582 4040 1.7783
1.4879 15.7972 4050 1.7822
1.7163 15.8362 4060 1.7783
1.8122 15.8752 4070 1.7755
1.7154 15.9142 4080 1.7750
1.6695 15.9532 4090 1.7748
1.4664 15.9922 4100 1.7783
1.6513 16.0312 4110 1.7775
1.7294 16.0702 4120 1.7773
1.7467 16.1092 4130 1.7765
1.5965 16.1482 4140 1.7756
1.5172 16.1872 4150 1.7774
1.6064 16.2262 4160 1.7789
1.7203 16.2652 4170 1.7783
1.7374 16.3042 4180 1.7768
1.6223 16.3432 4190 1.7752
1.5243 16.3823 4200 1.7757
1.603 16.4213 4210 1.7788
1.7494 16.4603 4220 1.7788
1.7415 16.4993 4230 1.7771
1.6295 16.5383 4240 1.7754
1.5225 16.5773 4250 1.7768
1.5805 16.6163 4260 1.7790
1.7568 16.6553 4270 1.7786
1.7337 16.6943 4280 1.7765
1.6712 16.7333 4290 1.7748
1.5539 16.7723 4300 1.7760
1.5777 16.8113 4310 1.7779
1.7397 16.8503 4320 1.7771
1.7327 16.8893 4330 1.7764
1.6784 16.9283 4340 1.7764
1.5556 16.9673 4350 1.7773
1.5201 17.0063 4360 1.7795
1.7309 17.0453 4370 1.7793
1.724 17.0843 4380 1.7782
1.739 17.1234 4390 1.7773
1.5871 17.1624 4400 1.7764
1.4126 17.2014 4410 1.7781
1.7192 17.2404 4420 1.7789
1.6981 17.2794 4430 1.7782
1.7179 17.3184 4440 1.7779
1.6096 17.3574 4450 1.7776
1.375 17.3964 4460 1.7789
1.6688 17.4354 4470 1.7793
1.6978 17.4744 4480 1.7786
1.7511 17.5134 4490 1.7780
1.6402 17.5524 4500 1.7772
1.4435 17.5914 4510 1.7775
1.7534 17.6304 4520 1.7780
1.7607 17.6694 4530 1.7778
1.771 17.7084 4540 1.7771
1.6721 17.7474 4550 1.7764
1.3991 17.7864 4560 1.7769
1.7386 17.8255 4570 1.7776
1.7345 17.8645 4580 1.7781
1.6783 17.9035 4590 1.7781
1.5988 17.9425 4600 1.7777
1.4249 17.9815 4610 1.7779
1.697 18.0205 4620 1.7782
1.7454 18.0595 4630 1.7782
1.7218 18.0985 4640 1.7779
1.6152 18.1375 4650 1.7775
1.6062 18.1765 4660 1.7772
1.4834 18.2155 4670 1.7777
1.711 18.2545 4680 1.7779
1.7126 18.2935 4690 1.7778
1.7342 18.3325 4700 1.7773
1.6363 18.3715 4710 1.7768
1.4322 18.4105 4720 1.7773
1.7839 18.4495 4730 1.7775
1.7095 18.4885 4740 1.7776
1.6805 18.5275 4750 1.7775
1.5983 18.5666 4760 1.7776
1.4826 18.6056 4770 1.7780
1.7331 18.6446 4780 1.7783
1.7222 18.6836 4790 1.7782
1.6478 18.7226 4800 1.7780
1.616 18.7616 4810 1.7779
1.5055 18.8006 4820 1.7781
1.7357 18.8396 4830 1.7782
1.7558 18.8786 4840 1.7782
1.6319 18.9176 4850 1.7782
1.596 18.9566 4860 1.7782
1.4068 18.9956 4870 1.7784
1.729 19.0346 4880 1.7784
1.7094 19.0736 4890 1.7784
1.744 19.1126 4900 1.7784
1.6065 19.1516 4910 1.7784
1.4862 19.1906 4920 1.7784
1.6656 19.2296 4930 1.7784
1.7731 19.2686 4940 1.7784
1.7516 19.3077 4950 1.7784
1.6241 19.3467 4960 1.7783
1.479 19.3857 4970 1.7783
1.6246 19.4247 4980 1.7783
1.7093 19.4637 4990 1.7783
1.7031 19.5027 5000 1.7783
1.6452 19.5417 5010 1.7783
1.4704 19.5807 5020 1.7783
1.6488 19.6197 5030 1.7783
1.7266 19.6587 5040 1.7783
1.7535 19.6977 5050 1.7783
1.586 19.7367 5060 1.7783
1.4617 19.7757 5070 1.7783
1.5699 19.8147 5080 1.7783
1.6759 19.8537 5090 1.7783
1.7787 19.8927 5100 1.7783
1.6833 19.9317 5110 1.7783
1.4781 19.9707 5120 1.7783
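
The table can be read programmatically. The sketch below parses a handful of rows copied from it, estimates the number of optimizer steps per epoch from the Epoch and Step columns, and picks the row with the lowest validation loss among the excerpted rows only. The dataset-size figure is a rough estimate that assumes every optimizer step consumed a full batch of 24 examples; the card does not state the dataset size.

```python
# A few rows copied from the table above, in column order:
# training loss, epoch, step, validation loss.
excerpt = """\
3.5619 0.0390 10 3.6739
2.4393 0.3901 100 2.3441
1.9697 3.9005 1000 1.8546
1.7117 11.9356 3060 1.7703
1.4781 19.9707 5120 1.7783
"""

rows = []
for line in excerpt.splitlines():
    train_loss, epoch, step, val_loss = line.split()
    rows.append(
        {
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        }
    )

# Steps per epoch, estimated from the last logged row.
last = rows[-1]
steps_per_epoch = last["step"] / last["epoch"]

# Rough dataset size, assuming 24 examples per optimizer step
# (total_train_batch_size from the hyperparameter list).
approx_examples = steps_per_epoch * 24

# Lowest validation loss among the excerpted rows only.
best = min(rows, key=lambda r: r["val_loss"])

print(f"steps per epoch ~ {steps_per_epoch:.1f}")
print(f"approx. training examples ~ {approx_examples:.0f}")
print("best excerpted row:", best)
```

On these rows the estimate comes out at about 256 steps per epoch, and the excerpted row with the lowest validation loss is step 3060 (1.7703), slightly below the final value of 1.7783.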

Framework versions

  • PEFT 0.12.0
  • Transformers 4.45.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.20.1
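
The card gives no usage instructions. Since PEFT is listed among the framework versions, one plausible way to use the weights is to load the base model and attach this repository as a PEFT adapter. The sketch below is an assumption about usage, not a documented recipe: the adapter type is not stated in the card, load_adapter is a hypothetical helper, and access to google/gemma-2b-it may require accepting its license on the Hub.

```python
BASE_MODEL_ID = "google/gemma-2b-it"
ADAPTER_ID = "SangMoone/results_1024_hard"


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter weights on top of it."""
    # Imported lazily so this file can be read and imported even when
    # peft and transformers are not installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


# Example usage (downloads weights, so it is left commented out):
# tokenizer, model = load_adapter()
# inputs = tokenizer("Hello", return_tensors="pt")
# output_ids = model.generate(**inputs, max_new_tokens=32)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Pinning package versions close to those listed above is a reasonable precaution when reproducing results.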
Model tree for SangMoone/results_1024_hard

  • Base model: google/gemma-2b-it
  • This model: adapter