---
license: cc-by-4.0
---
**pythia-1.4B-finetuned-oa-instructions**

This model is a fine-tuned version of Pythia-1.4B on the oa dataset. It achieves the following results on the evaluation set:

* Loss: 0.1224
**Model description**

More information needed

**Intended uses & limitations**

More information needed
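Pending fuller documentation, the sketch below shows one way to load the checkpoint for inference with `transformers`. The repo id is a placeholder assumption; substitute the actual checkpoint path or Hub id.

```python
# Minimal inference sketch; not an official usage guide.
# "your-org/..." is a placeholder repo id, assumed for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/pythia-1.4B-finetuned-oa-instructions"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain instruction tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```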
**Training and evaluation data**

More information needed
**Training procedure**

**Training hyperparameters**
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

* seed: 42
* learning_rate: 5e-06
* train_batch_size: 32
* eval_batch_size: 8
* optimizer: Adam with betas=(0.9, 0.999), eps=1e-08 and weight_decay=0.0
* lr_scheduler_type: linear
* lr_scheduler_warmup_steps: 5
* training_steps: 5000
* mixed_precision_training: fp16
* num_examples: 53,455
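As a rough guide, these settings map onto Hugging Face `TrainingArguments` as in the sketch below. This is a reconstruction from the list above, not the original training script; the output directory is a placeholder and unlisted arguments keep their defaults.

```python
# Reconstruction of the listed hyperparameters as TrainingArguments.
# "output_dir" is a placeholder; this is a sketch, not the actual script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="pythia-1.4B-finetuned-oa-instructions",  # placeholder
    seed=42,
    learning_rate=5e-06,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    weight_decay=0.0,
    lr_scheduler_type="linear",
    warmup_steps=5,
    max_steps=5000,
    fp16=True,
)
```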
**Training results**
```
{
    "epoch": 1.0,
    "train_loss": 0.8031303182039198,
    "train_runtime": 6338.6403,
    "train_samples": 53455,
    "train_samples_per_second": 8.433,
    "train_steps_per_second": 0.264
}
```
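These throughput figures are internally consistent: 53,455 samples / 6,338.6 s ≈ 8.43 samples per second, which divided by the train batch size of 32 gives ≈ 0.264 steps per second.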
**Framework versions**

* Transformers 4.24.0
* PyTorch 1.10.0+cu111
* Datasets 2.10.0
* Tokenizers 0.12.1