amdchess-v9

This model is a fine-tuned version of amd/AMD-Llama-135m on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.6367
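
For readers who prefer perplexity, the reported loss can be converted with a one-line calculation. This is a minimal sketch that assumes the loss is the mean token-level cross-entropy in nats, which is the usual convention for causal language models trained with the Transformers Trainer; the card itself does not state this.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.6367

# Assumption: eval_loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity is roughly {perplexity:.2f}")
```

Under that assumption the perplexity comes out at about 1.89. The figure is only meaningful relative to the (undocumented) tokenization and evaluation data.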

Model description

More information needed

Intended uses & limitations

More information needed
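
Until the intended uses are documented, the following is a minimal sketch of loading the checkpoint as a causal language model with Transformers. The repository id is taken from the model page; the expected input format is not documented, so the prompt shown in the comment is purely hypothetical, and greedy decoding is an arbitrary choice for illustration.

```python
MODEL_ID = "nlpguy/amdchess-v9"  # repository id as shown on the model page


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Load the checkpoint and greedily continue `prompt`."""
    # Imported lazily so this module can be imported without Transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding (do_sample=False) keeps the output deterministic.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Hypothetical usage; the prompt format the model expects is an assumption:
# print(generate("1. e4 e5 2. Nf3"))
```

Calling `generate` downloads the weights on first use, so it needs network access and a PyTorch install.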

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: grokadamw with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
num_epochs: 1
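
As a rough guide to reproducing this setup, the listed hyperparameters map onto `TrainingArguments` keyword arguments as sketched below. Several things are assumptions rather than facts from this card: the output directory is hypothetical, the batch sizes are treated as per-device values, and anything not listed (warmup, weight decay, gradient accumulation) is left at the library defaults. Using `optim="grokadamw"` also requires the separate `grokadamw` package to be installed.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
training_kwargs = {
    "output_dir": "amdchess-v9",       # hypothetical output path
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 8,  # assumption: card value is per device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "grokadamw",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 1,
}


def build_training_arguments():
    """Construct TrainingArguments from the kwargs above."""
    # Imported lazily so the dict can be inspected without Transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs)
```

The resulting object would then be passed to a `Trainer` together with the base model `amd/AMD-Llama-135m` and a dataset, which this card does not identify.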

Training results

Training Loss	Epoch	Step	Validation Loss
1.4763	0.0100	17	1.4222
1.0937	0.0201	34	1.1053
1.0732	0.0301	51	1.0270
0.991	0.0401	68	0.9671
1.0235	0.0502	85	0.9474
0.8849	0.0602	102	0.9239
0.9108	0.0702	119	0.8907
0.8907	0.0803	136	0.8745
0.8685	0.0903	153	0.8619
0.9375	0.1004	170	0.8547
0.7897	0.1104	187	0.8412
0.8594	0.1204	204	0.8293
0.8495	0.1305	221	0.8226
0.8618	0.1405	238	0.8129
0.8643	0.1505	255	0.8052
0.7375	0.1606	272	0.7985
0.7322	0.1706	289	0.7953
0.7991	0.1806	306	0.7923
0.8269	0.1907	323	0.7856
0.8031	0.2007	340	0.7776
0.7605	0.2107	357	0.7737
0.804	0.2208	374	0.7664
0.7683	0.2308	391	0.7600
0.7667	0.2409	408	0.7610
0.7823	0.2509	425	0.7508
0.7608	0.2609	442	0.7484
0.7291	0.2710	459	0.7457
0.8157	0.2810	476	0.7393
0.7526	0.2910	493	0.7353
0.7099	0.3011	510	0.7360
0.8242	0.3111	527	0.7331
0.7849	0.3211	544	0.7285
0.7558	0.3312	561	0.7224
0.6278	0.3412	578	0.7225
0.7135	0.3512	595	0.7197
0.6425	0.3613	612	0.7180
0.7721	0.3713	629	0.7137
0.8091	0.3813	646	0.7097
0.7518	0.3914	663	0.7063
0.7299	0.4014	680	0.7053
0.7563	0.4115	697	0.7051
0.658	0.4215	714	0.6997
0.7096	0.4315	731	0.6966
0.7555	0.4416	748	0.6954
0.7292	0.4516	765	0.6936
0.6349	0.4616	782	0.6908
0.6996	0.4717	799	0.6892
0.6849	0.4817	816	0.6892
0.7023	0.4917	833	0.6847
0.6547	0.5018	850	0.6850
0.7549	0.5118	867	0.6826
0.6987	0.5218	884	0.6798
0.648	0.5319	901	0.6796
0.7308	0.5419	918	0.6775
0.7245	0.5519	935	0.6756
0.6915	0.5620	952	0.6745
0.7287	0.5720	969	0.6716
0.739	0.5821	986	0.6704
0.7168	0.5921	1003	0.6686
0.685	0.6021	1020	0.6671
0.7183	0.6122	1037	0.6656
0.7138	0.6222	1054	0.6644
0.6738	0.6322	1071	0.6620
0.634	0.6423	1088	0.6611
0.703	0.6523	1105	0.6606
0.6538	0.6623	1122	0.6584
0.7167	0.6724	1139	0.6564
0.6717	0.6824	1156	0.6545
0.6633	0.6924	1173	0.6538
0.6035	0.7025	1190	0.6535
0.6444	0.7125	1207	0.6514
0.7171	0.7226	1224	0.6502
0.7157	0.7326	1241	0.6489
0.7028	0.7426	1258	0.6480
0.681	0.7527	1275	0.6479
0.6711	0.7627	1292	0.6464
0.7113	0.7727	1309	0.6454
0.7329	0.7828	1326	0.6454
0.694	0.7928	1343	0.6436
0.6304	0.8028	1360	0.6431
0.7129	0.8129	1377	0.6420
0.6531	0.8229	1394	0.6411
0.6791	0.8329	1411	0.6406
0.6963	0.8430	1428	0.6401
0.6285	0.8530	1445	0.6402
0.6484	0.8630	1462	0.6398
0.6505	0.8731	1479	0.6394
0.6985	0.8831	1496	0.6389
0.6643	0.8932	1513	0.6386
0.6292	0.9032	1530	0.6381
0.6237	0.9132	1547	0.6377
0.6159	0.9233	1564	0.6375
0.7027	0.9333	1581	0.6372
0.7068	0.9433	1598	0.6371
0.6021	0.9534	1615	0.6369
0.6812	0.9634	1632	0.6368
0.6805	0.9734	1649	0.6368
0.628	0.9835	1666	0.6367
0.6507	0.9935	1683	0.6367
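
The table also allows a rough estimate of the run length and of the learning rate near the end of training. The sketch below assumes a single device with no gradient accumulation (so one step consumes 8 examples) and a cosine schedule with no warmup, since none is listed; both are assumptions, not facts stated in this card.

```python
import math

# Values read from the final row of the table and the hyperparameter list.
last_step, last_epoch = 1683, 0.9935
train_batch_size = 8
peak_lr = 3e-5

# Steps in one full epoch, inferred from the step/epoch ratio.
steps_per_epoch = round(last_step / last_epoch)

# Assumption: single device, no gradient accumulation.
approx_train_examples = steps_per_epoch * train_batch_size


def cosine_lr(step: int, total_steps: int = steps_per_epoch,
              peak: float = peak_lr) -> float:
    """Half-cosine decay from `peak` to 0 over `total_steps` (no warmup assumed)."""
    progress = step / total_steps
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, approx_train_examples)
print(f"lr at step {last_step}: {cosine_lr(last_step):.2e}")
```

Under these assumptions the epoch is about 1694 steps, or roughly 13,552 training examples, and the learning rate has decayed almost to zero by the last logged step. That is consistent with the validation loss flattening out at 0.6367 over the final rows.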

Framework versions

Transformers 4.46.1
Pytorch 2.5.0+cu121
Datasets 3.1.0
Tokenizers 0.20.1
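
To check a local environment against these versions, a small helper along the following lines may be useful. It is only a sketch: the PyPI distribution names (in particular `torch` for Pytorch) are assumptions, and an exact string match is stricter than most setups need.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.46.1",
    "torch": "2.5.0+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.20.1",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: (wanted, installed)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


print(version_mismatches())
```

An empty dictionary means the environment matches the card exactly; a CPU-only PyTorch build will show up as a mismatch because of the `+cu121` suffix.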

nlpguy/amdchess-v9