
gemma2_on_korean_conv

This model is a PEFT adapter fine-tuned from beomi/gemma-ko-2b on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2691
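
Because the framework list and model tree below indicate a PEFT adapter rather than a full checkpoint, the weights need to be loaded on top of beomi/gemma-ko-2b. A minimal loading sketch using the standard transformers and peft APIs (the dtype, device settings, and prompt are assumptions; the expected conversation format is not documented):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "beomi/gemma-ko-2b"
adapter_id = "ghost613/gemma2_on_korean_conv"

# Load the base model, then apply this repository's adapter weights on top.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference; use float32 on hardware without bf16
    device_map="auto",           # requires the accelerate package
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# The prompt format is not documented; a plain Korean prompt is a guess.
prompt = "안녕하세요, 요즘 어떻게 지내세요?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```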

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 10
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 3600
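
For reference, a sketch of how these settings map onto transformers.TrainingArguments (the output directory is an assumed name, and the Trainer/dataset wiring is not documented here):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above. The effective batch size of 10
# comes from per-device batch size 2 x 5 gradient-accumulation steps.
training_args = TrainingArguments(
    output_dir="gemma2_on_korean_conv",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=5,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=3600,
    eval_strategy="steps",  # named "evaluation_strategy" on transformers < 4.41
    eval_steps=100,         # matches the 100-step cadence in the results table
    logging_steps=100,
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults,
    # matching the Adam settings listed above.
)
# These arguments would then be passed to a transformers.Trainer together with
# the (unspecified) training and evaluation datasets and the PEFT-wrapped model.
```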

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.236         | 0.1281 | 100  | 1.3164          |
| 1.1365        | 0.2563 | 200  | 1.2335          |
| 1.125         | 0.3844 | 300  | 1.1832          |
| 1.1023        | 0.5126 | 400  | 1.1517          |
| 1.1244        | 0.6407 | 500  | 1.1254          |
| 1.0095        | 0.7688 | 600  | 1.1015          |
| 1.1354        | 0.8970 | 700  | 1.0979          |
| 0.898         | 1.0251 | 800  | 1.0986          |
| 0.9075        | 1.1533 | 900  | 1.0897          |
| 0.864         | 1.2814 | 1000 | 1.0924          |
| 0.9093        | 1.4095 | 1100 | 1.0810          |
| 0.8207        | 1.5377 | 1200 | 1.0859          |
| 0.8376        | 1.6658 | 1300 | 1.0712          |
| 0.8546        | 1.7940 | 1400 | 1.0705          |
| 0.8231        | 1.9221 | 1500 | 1.0659          |
| 0.6411        | 2.0502 | 1600 | 1.1030          |
| 0.6646        | 2.1784 | 1700 | 1.1065          |
| 0.6662        | 2.3065 | 1800 | 1.1038          |
| 0.6596        | 2.4346 | 1900 | 1.1033          |
| 0.6761        | 2.5628 | 2000 | 1.1137          |
| 0.7028        | 2.6909 | 2100 | 1.1071          |
| 0.6339        | 2.8191 | 2200 | 1.1076          |
| 0.6714        | 2.9472 | 2300 | 1.1157          |
| 0.5146        | 3.0753 | 2400 | 1.1607          |
| 0.4817        | 3.2035 | 2500 | 1.1779          |
| 0.5094        | 3.3316 | 2600 | 1.1794          |
| 0.4954        | 3.4598 | 2700 | 1.1887          |
| 0.4886        | 3.5879 | 2800 | 1.1888          |
| 0.5176        | 3.7160 | 2900 | 1.1819          |
| 0.5076        | 3.8442 | 3000 | 1.1900          |
| 0.5286        | 3.9723 | 3100 | 1.1838          |
| 0.3827        | 4.1005 | 3200 | 1.2526          |
| 0.3933        | 4.2286 | 3300 | 1.2663          |
| 0.3703        | 4.3567 | 3400 | 1.2671          |
| 0.3967        | 4.4849 | 3500 | 1.2661          |
| 0.3978        | 4.6130 | 3600 | 1.2691          |

Note that validation loss reaches its minimum (1.0659) at step 1500 and rises thereafter while training loss keeps falling, which may indicate overfitting in the later epochs; the headline loss of 1.2691 is simply the value at the final step.

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model tree for ghost613/gemma2_on_korean_conv

  • Base model: beomi/gemma-ko-2b
  • Adapter: this model