---
library_name: transformers
tags:
- trl
- dpo
---

# Model Card for Llama-3-8B Fine-Tuned with DPO on Orca

## Model Details

This model is Llama-3-8B fine-tuned with Direct Preference Optimization (DPO) on the Orca DPO dataset.
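
As a quick-start sketch, the model can be loaded with the standard `transformers` generation API. The repository id below is a placeholder (the card does not state the published model id), and this assumes the fine-tuned weights have been merged into a standalone checkpoint rather than shipped as a separate LoRA adapter.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id; replace with the actual repository of this fine-tune.
model_id = "your-username/llama-3-8b-orca-dpo"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Explain the difference between supervised fine-tuning and DPO in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```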

## Training Details

### Training Data

Trained on the Orca DPO preference dataset, i.e. prompt / chosen / rejected pairs.
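
A minimal preprocessing sketch, assuming the commonly used `Intel/orca_dpo_pairs` dataset on the Hugging Face Hub (the card does not name the exact dataset); TRL's `DPOTrainer` expects plain-text `prompt`, `chosen`, and `rejected` columns.

```python
from datasets import load_dataset

# Assumption: Orca DPO pairs as published by Intel; the card only says "Orca DPO dataset".
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

def to_dpo_format(example):
    # Fold the system message into the prompt; keep chosen/rejected responses as-is.
    prompt = example["system"] + "\n\n" + example["question"]
    return {
        "prompt": prompt,
        "chosen": example["chosen"],
        "rejected": example["rejected"],
    }

dataset = dataset.map(to_dpo_format, remove_columns=dataset.column_names)
```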

### Training Procedure

NEFTune (noisy embedding fine-tuning) is enabled for robustness, and the model is fine-tuned with the TRL DPO trainer using 4-bit NF4 quantization and LoRA adapters, with the hyperparameters listed below; a training sketch follows the hyperparameter list.

#### Training Hyperparameters

- lora_alpha = 16
- lora_r = 64
- lora_dropout = 0.1
- adam_beta1 = 0.9
- adam_beta2 = 0.999
- weight_decay = 0.001
- max_grad_norm = 0.3
- learning_rate = 2e-4
- bnb_4bit_quant_type = "nf4"
- optim = "paged_adamw_32bit"
- max_steps = 5000
- gradient_accumulation_steps = 4
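
A minimal end-to-end sketch of this procedure, assuming a TRL release in the 0.7–0.8 range (where `DPOTrainer` takes `beta` and length limits directly; newer releases move these into `DPOConfig`) and NEFTune enabled through `TrainingArguments`. The NEFTune noise level, DPO `beta`, batch size, sequence lengths, and output path below are illustrative assumptions; everything else mirrors the hyperparameters above.

```python
import torch
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import DPOTrainer

base_model = "meta-llama/Meta-Llama-3-8B"

# 4-bit NF4 quantization (QLoRA-style), matching bnb_4bit_quant_type = "nf4".
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapters with the r / alpha / dropout listed above.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./llama3-8b-orca-dpo",  # assumed output path
    per_device_train_batch_size=4,      # assumption; not stated in the card
    gradient_accumulation_steps=4,
    max_steps=5000,
    learning_rate=2e-4,
    adam_beta1=0.9,
    adam_beta2=0.999,
    weight_decay=0.001,
    max_grad_norm=0.3,
    optim="paged_adamw_32bit",
    neftune_noise_alpha=5,              # NEFTune; the alpha value is an assumption
    bf16=True,
    logging_steps=50,
)

# ref_model=None: with a peft_config, DPOTrainer uses the frozen base weights as the reference policy.
trainer = DPOTrainer(
    model,
    ref_model=None,
    args=training_args,
    beta=0.1,                 # DPO temperature; assumed default
    train_dataset=dataset,    # formatted as in the Training Data section
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=512,    # assumption
    max_length=1024,          # assumption
)
trainer.train()
```

Because the reference policy is recovered from the frozen base weights when a LoRA config is supplied, no second full copy of the model is kept in memory, which is what lets the 4-bit base model plus paged 32-bit AdamW fit on a single GPU.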