Plainly Optimized Network
Dataset: BIGBENCH
Trainer Hyperparameters:
- lr = 5e-05
- per_device_batch_size = 8
- gradient_accumulation_steps = 2
- weight_decay = 0.0
- seed = 42
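
The batch size and gradient accumulation settings above combine into an effective batch size per optimizer step. The sketch below works that out, keeping the hyperparameter names as they appear on this card. The device count is not stated in the card, so `num_devices = 1` is an assumption made only for illustration.

```python
# Hyperparameters copied from the list above.
hyperparameters = {
    "lr": 5e-05,
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 0.0,
    "seed": 42,
}

# ASSUMPTION: the card does not say how many devices were used.
num_devices = 1

# Examples contributing to each optimizer update:
# per-device batch * accumulation steps * number of devices.
effective_batch_size = (
    hyperparameters["per_device_batch_size"]
    * hyperparameters["gradient_accumulation_steps"]
    * num_devices
)

print(f"Effective batch size: {effective_batch_size}")
```

With more than one device, the effective batch size scales linearly with `num_devices`.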
eval_loss | eval_accuracy | epoch |
---|---|---|
10.379 | 0.571 | 1.0 |
9.388 | 0.643 | 2.0 |
10.286 | 0.571 | 3.0 |
10.324 | 0.571 | 4.0 |
10.254 | 0.571 | 5.0 |
10.166 | 0.571 | 6.0 |
10.122 | 0.571 | 7.0 |
10.020 | 0.571 | 8.0 |
10.035 | 0.571 | 9.0 |
9.961 | 0.571 | 10.0 |
9.963 | 0.571 | 11.0 |
9.962 | 0.571 | 12.0 |
9.990 | 0.500 | 13.0 |
10.817 | 0.571 | 14.0 |
10.030 | 0.571 | 15.0 |
10.049 | 0.571 | 16.0 |
10.057 | 0.571 | 17.0 |
10.067 | 0.571 | 18.0 |
10.080 | 0.571 | 19.0 |
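
One way to read the table is to pick the epoch with the lowest `eval_loss` or the highest `eval_accuracy`. The following is a minimal sketch that does this over the rows copied from the table above. The variable names are illustrative and not part of any training script for this model.

```python
# Rows copied from the evaluation table: (eval_loss, eval_accuracy, epoch).
rows = [
    (10.379, 0.571, 1.0),
    (9.388, 0.643, 2.0),
    (10.286, 0.571, 3.0),
    (10.324, 0.571, 4.0),
    (10.254, 0.571, 5.0),
    (10.166, 0.571, 6.0),
    (10.122, 0.571, 7.0),
    (10.020, 0.571, 8.0),
    (10.035, 0.571, 9.0),
    (9.961, 0.571, 10.0),
    (9.963, 0.571, 11.0),
    (9.962, 0.571, 12.0),
    (9.990, 0.500, 13.0),
    (10.817, 0.571, 14.0),
    (10.030, 0.571, 15.0),
    (10.049, 0.571, 16.0),
    (10.057, 0.571, 17.0),
    (10.067, 0.571, 18.0),
    (10.080, 0.571, 19.0),
]

# Lowest evaluation loss and highest evaluation accuracy.
best_by_loss = min(rows, key=lambda r: r[0])
best_by_accuracy = max(rows, key=lambda r: r[1])

print(f"Lowest eval_loss: {best_by_loss[0]} at epoch {best_by_loss[2]}")
print(f"Highest eval_accuracy: {best_by_accuracy[1]} at epoch {best_by_accuracy[2]}")
```

Both criteria select the same row here, epoch 2.0, and accuracy stays at 0.571 for every other epoch except epoch 13.0.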