collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd1

This model is a fine-tuned version of google/gemma-2-27b on an unknown dataset. At the final logged evaluation (step 180, 9,260,048 input tokens seen), it reaches a validation loss of 0.9205 on the evaluation set.

Model description

More information needed

The hyperparameters used during training are not listed here. Training results, with evaluation logged every 5 steps over approximately one epoch:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.1282	0
1.6371	0.0278	5	1.0191	267540
1.6662	0.0555	10	0.9724	529072
1.3682	0.0833	15	0.9593	785900
1.465	0.1111	20	0.9571	1044336
1.4939	0.1388	25	0.9602	1302088
1.2547	0.1666	30	0.9604	1553344
1.3517	0.1944	35	0.9564	1808072
1.2594	0.2221	40	0.9531	2069656
1.122	0.2499	45	0.9501	2318876
1.1374	0.2777	50	0.9461	2574304
1.0402	0.3054	55	0.9441	2835460
0.9125	0.3332	60	0.9417	3100792
0.9725	0.3610	65	0.9371	3359628
1.0081	0.3888	70	0.9372	3624016
0.9675	0.4165	75	0.9346	3880016
1.0841	0.4443	80	0.9351	4126948
1.0015	0.4721	85	0.9313	4380144
1.0436	0.4998	90	0.9319	4645252
1.0193	0.5276	95	0.9298	4896128
1.0469	0.5554	100	0.9291	5148796
0.8706	0.5831	105	0.9269	5411164
0.8656	0.6109	110	0.9262	5663420
1.0066	0.6387	115	0.9243	5931756
0.8539	0.6664	120	0.9247	6192724
0.9333	0.6942	125	0.9233	6447744
0.8919	0.7220	130	0.9224	6704364
0.8694	0.7497	135	0.9221	6955692
0.916	0.7775	140	0.9211	7212120
0.9457	0.8053	145	0.9215	7469356
0.8997	0.8330	150	0.9199	7730508
0.8992	0.8608	155	0.9206	7979484
0.9604	0.8886	160	0.9191	8234480
0.9	0.9163	165	0.9186	8489188
0.9385	0.9441	170	0.9202	8745136
0.964	0.9719	175	0.9175	9000760
0.9423	0.9997	180	0.9205	9260048
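
The log above can be inspected programmatically, for example to find the step with the lowest validation loss. The following is a minimal sketch, assuming the table is available as tab-separated text in the same column order as shown. The names `LogRow`, `parse_rows`, `best_row`, and `relative_improvement` are hypothetical helpers introduced for illustration, and `SAMPLE` holds only a five-row excerpt of the table, not the full log.

```python
from typing import NamedTuple, Optional


class LogRow(NamedTuple):
    train_loss: Optional[float]  # None where the log says "No log"
    epoch: float
    step: int
    val_loss: float
    tokens_seen: int


def parse_rows(text: str) -> list[LogRow]:
    """Parse tab-separated log lines, skipping the header and malformed lines."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split("\t")
        if len(fields) != 5 or fields[0] == "Training Loss":
            continue
        train, epoch, step, val, tokens = fields
        rows.append(
            LogRow(
                None if train == "No log" else float(train),
                float(epoch),
                int(step),
                float(val),
                int(tokens),
            )
        )
    return rows


def best_row(rows: list[LogRow]) -> LogRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def relative_improvement(rows: list[LogRow]) -> float:
    """Fractional drop in validation loss from the first row to the best row."""
    start = rows[0].val_loss
    return (start - best_row(rows).val_loss) / start


# Excerpt of the table above (first two and last three rows only).
SAMPLE = "\n".join(
    [
        "Training Loss\tEpoch\tStep\tValidation Loss\tInput Tokens Seen",
        "No log\t0\t0\t1.1282\t0",
        "1.6371\t0.0278\t5\t1.0191\t267540",
        "0.9385\t0.9441\t170\t0.9202\t8745136",
        "0.964\t0.9719\t175\t0.9175\t9000760",
        "0.9423\t0.9997\t180\t0.9205\t9260048",
    ]
)

rows = parse_rows(SAMPLE)
best = best_row(rows)
print(f"lowest validation loss {best.val_loss} at step {best.step}")
print(f"relative improvement over step 0: {relative_improvement(rows):.1%}")
```

In the full table, the lowest validation loss is also the step-175 value (0.9175), slightly below the final step-180 value (0.9205). The difference is small, so it may reflect evaluation noise rather than a meaningfully better checkpoint.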