gpt2

This model was trained from scratch on an unknown dataset. On the evaluation set it reaches a final validation loss of 0.2981 (epoch 30, step 180).

Model description

More information needed

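The hyperparameters used during training are not listed. The log below covers 30 epochs at 6 steps per epoch (180 steps in total), with one evaluation at the end of each epoch. The Training Loss column reads "No log" throughout, so only the validation loss is available.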

Training Loss	Epoch	Step	Validation Loss
No log	1.0	6	1.2395
No log	2.0	12	1.0996
No log	3.0	18	0.9962
No log	4.0	24	0.8884
No log	5.0	30	0.8230
No log	6.0	36	0.7659
No log	7.0	42	0.7012
No log	8.0	48	0.6511
No log	9.0	54	0.6197
No log	10.0	60	0.5812
No log	11.0	66	0.5467
No log	12.0	72	0.5068
No log	13.0	78	0.4874
No log	14.0	84	0.4649
No log	15.0	90	0.4498
No log	16.0	96	0.4297
No log	17.0	102	0.4009
No log	18.0	108	0.3867
No log	19.0	114	0.3814
No log	20.0	120	0.3634
No log	21.0	126	0.3476
No log	22.0	132	0.3386
No log	23.0	138	0.3316
No log	24.0	144	0.3207
No log	25.0	150	0.3140
No log	26.0	156	0.3073
No log	27.0	162	0.3028
No log	28.0	168	0.3011
No log	29.0	174	0.2986
No log	30.0	180	0.2981
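
The validation losses above can be summarized with a short script. This is a minimal sketch, not part of the original training code: the values are copied by hand from the log, the helper names `perplexity` and `improvements` are hypothetical, and the perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not confirm.

```python
import math

# Validation loss per epoch (epochs 1..30), copied from the training log above.
VAL_LOSS = [
    1.2395, 1.0996, 0.9962, 0.8884, 0.8230, 0.7659, 0.7012, 0.6511, 0.6197, 0.5812,
    0.5467, 0.5068, 0.4874, 0.4649, 0.4498, 0.4297, 0.4009, 0.3867, 0.3814, 0.3634,
    0.3476, 0.3386, 0.3316, 0.3207, 0.3140, 0.3073, 0.3028, 0.3011, 0.2986, 0.2981,
]


def perplexity(loss: float) -> float:
    """Convert a loss to perplexity.

    Assumption: `loss` is a mean per-token cross-entropy measured in nats.
    """
    return math.exp(loss)


def improvements(losses: list[float]) -> list[float]:
    """Absolute drop in validation loss from each epoch to the next."""
    return [prev - cur for prev, cur in zip(losses, losses[1:])]


if __name__ == "__main__":
    first, last = VAL_LOSS[0], VAL_LOSS[-1]
    print(f"epoch 1:  loss={first:.4f}  perplexity={perplexity(first):.3f}")
    print(f"epoch {len(VAL_LOSS)}: loss={last:.4f}  perplexity={perplexity(last):.3f}")

    drops = improvements(VAL_LOSS)
    print(f"largest per-epoch drop:  {max(drops):.4f}")
    print(f"drop over the last epoch: {drops[-1]:.4f}")
```

The per-epoch drops shrink steadily toward the end of the run, which suggests the validation loss is close to a plateau by epoch 30. Without the training loss or a description of the dataset, the log alone cannot show whether further training would help or whether the model is overfitting.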