# collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd2
This model is a fine-tuned version of [google/gemma-2-27b](https://huggingface.co/google/gemma-2-27b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9228
- Num Input Tokens Seen: 9275784
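
This card does not yet include usage details, so here is a minimal loading sketch. It assumes the checkpoint is hosted under the repository id `RylanSchaeffer/collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd2` and keeps the standard Gemma 2 causal-LM layout; adjust dtype and device placement to your hardware.

```python
# Minimal inference sketch; the prompt and generation settings are
# illustrative assumptions, not part of the original card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "RylanSchaeffer/collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd2"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # 27B parameters: bf16 halves memory vs fp32
    device_map="auto",           # needs `accelerate`; shards across available GPUs
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```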
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 8e-06
- train_batch_size: 4
- eval_batch_size: 16
- seed: 2
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
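
The total train batch size follows from the per-device batch size times the accumulation steps: 4 × 32 = 128. With roughly 187 optimizer steps in the single epoch (step 185 corresponds to epoch 0.9886 in the table below), `lr_scheduler_warmup_ratio: 0.05` amounts to about 9 warmup steps. As a hedged reconstruction only, these settings map onto `transformers.TrainingArguments` as sketched here; the output directory is a placeholder, and the original run may have used options not recorded on this card.

```python
# Hypothetical reconstruction of the listed hyperparameters; output_dir is a
# placeholder, and anything not on the card (logging, saving, precision) is omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="collapse_gemma-2-27b_hs2_accumulate_iter2_sftsd2",  # placeholder
    learning_rate=8e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=16,
    seed=2,
    gradient_accumulation_steps=32,   # 4 * 32 = total_train_batch_size 128
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,                   # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```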
### Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
---|---|---|---|---|
No log | 0 | 0 | 1.1282 | 0 |
1.6524 | 0.0267 | 5 | 1.0225 | 246112 |
1.8688 | 0.0534 | 10 | 0.9766 | 491112 |
1.4621 | 0.0802 | 15 | 0.9656 | 736432 |
1.3566 | 0.1069 | 20 | 0.9668 | 984696 |
1.3829 | 0.1336 | 25 | 0.9667 | 1227844 |
1.1059 | 0.1603 | 30 | 0.9655 | 1474208 |
1.0213 | 0.1870 | 35 | 0.9604 | 1730332 |
0.8427 | 0.2138 | 40 | 0.9554 | 1973604 |
0.9259 | 0.2405 | 45 | 0.9537 | 2222012 |
1.0387 | 0.2672 | 50 | 0.9480 | 2472664 |
0.9934 | 0.2939 | 55 | 0.9463 | 2723764 |
0.8751 | 0.3206 | 60 | 0.9401 | 2973420 |
0.8539 | 0.3474 | 65 | 0.9390 | 3224868 |
0.7838 | 0.3741 | 70 | 0.9367 | 3468436 |
0.7819 | 0.4008 | 75 | 0.9336 | 3722388 |
0.7431 | 0.4275 | 80 | 0.9317 | 3975588 |
0.7116 | 0.4542 | 85 | 0.9302 | 4223548 |
0.7088 | 0.4810 | 90 | 0.9305 | 4476068 |
0.6615 | 0.5077 | 95 | 0.9289 | 4721956 |
0.7609 | 0.5344 | 100 | 0.9296 | 4969064 |
0.7459 | 0.5611 | 105 | 0.9293 | 5219084 |
0.7784 | 0.5878 | 110 | 0.9288 | 5462520 |
0.7836 | 0.6146 | 115 | 0.9275 | 5710096 |
0.7615 | 0.6413 | 120 | 0.9291 | 5960688 |
0.7463 | 0.6680 | 125 | 0.9255 | 6210428 |
0.7071 | 0.6947 | 130 | 0.9282 | 6458116 |
0.7189 | 0.7214 | 135 | 0.9242 | 6700308 |
0.6639 | 0.7482 | 140 | 0.9256 | 6951476 |
0.6825 | 0.7749 | 145 | 0.9237 | 7202416 |
0.7322 | 0.8016 | 150 | 0.9253 | 7452600 |
0.7126 | 0.8283 | 155 | 0.9246 | 7695916 |
0.6821 | 0.8550 | 160 | 0.9228 | 7939444 |
0.6741 | 0.8818 | 165 | 0.9242 | 8188360 |
0.7033 | 0.9085 | 170 | 0.9237 | 8432324 |
0.6131 | 0.9352 | 175 | 0.9221 | 8679992 |
0.7369 | 0.9619 | 180 | 0.9210 | 8924020 |
0.732 | 0.9886 | 185 | 0.9247 | 9175312 |
### Framework versions

- Transformers 4.44.0
- PyTorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1