---
library_name: transformers
tags:
- trl
- dpo
---
# Model Card for Llama-3-8B Orca-DPO
## Model Details
This model is a Llama-3-8B checkpoint fine-tuned with DPO on the Orca-DPO preference dataset.
## Training Details
### Training Data
Trained on the Orca dataset formatted as DPO preference pairs (prompt, chosen, rejected).
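As a minimal sketch, the preference data can be loaded with the `datasets` library. The dataset ID and column mapping below are assumptions, since the card does not name the exact Hub dataset:

```python
from datasets import load_dataset

# Assumed dataset ID for the Orca DPO preference pairs; substitute the exact
# dataset used for this model if it differs.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

# TRL's DPO trainer expects "prompt", "chosen", and "rejected" columns;
# orca_dpo_pairs stores the prompt under "question", so rename it and drop
# the separate "system" column here for simplicity.
dataset = dataset.rename_column("question", "prompt").remove_columns("system")
```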
### Training Procedure
NEFTune (noisy-embedding fine-tuning) is enabled for robustness, and the model is fine-tuned with TRL's DPO trainer; see the sketch after the hyperparameter list below.
#### Training Hyperparameters
- lora_alpha = 16
- lora_r = 64
- lora_dropout = 0.1
- adam_beta1 = 0.9
- adam_beta2 = 0.999
- weight_decay = 0.001
- max_grad_norm = 0.3
- learning_rate = 2e-4
- bnb_4bit_quant_type = nf4
- optim = "paged_adamw_32bit"
- max_steps = 5000
- gradient_accumulation_steps = 4
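
The sketch below shows how these hyperparameters can map onto a QLoRA + DPO run with TRL. The base checkpoint, dataset ID, and NEFTune noise level are assumptions (the card does not list them), and argument names such as `processing_class` vs. `tokenizer` vary across `trl` versions:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed base checkpoint

# 4-bit NF4 quantization (bnb_4bit_quant_type = nf4)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter matching the listed hyperparameters
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

# DPO training arguments; neftune_noise_alpha turns on NEFTune noisy embeddings
training_args = DPOConfig(
    output_dir="llama3-8b-orca-dpo",
    max_steps=5000,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    adam_beta1=0.9,
    adam_beta2=0.999,
    weight_decay=0.001,
    max_grad_norm=0.3,
    optim="paged_adamw_32bit",
    neftune_noise_alpha=5,  # assumed noise scale; not listed in the card
)

# Assumed dataset ID; column preparation as in the data snippet above
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.rename_column("question", "prompt").remove_columns("system")

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```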