sonthenguyen committed
Commit 7541c3c (1 parent: f60f1f6)
Update README.md
README.md CHANGED
@@ -6,4 +6,6 @@ Training hyperparameters LoRA: r=16 lora_alpha=16 lora_dropout=0.05 bias="none"
 
 Training arguments: auto_find_batch_size=True gradient_checkpointing=True learning_rate=5e-7 lr_scheduler_type="cosine" max_steps=3922 optim="paged_adamw_32bit" warmup_steps=100
 
-DPOTrainer: beta=0.1 max_prompt_length=1024 max_length=1536
+DPOTrainer: beta=0.1 max_prompt_length=1024 max_length=1536
+
+Arxiv link: https://arxiv.org/abs/2403.02745
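
For readers who want the README's hyperparameters in machine-readable form, here is a minimal sketch that groups them into plain Python dictionaries and derives two quantities they imply. The dictionary names and the helper functions `lora_scaling` and `response_budget` are hypothetical and not part of the repository. The values are copied from the README lines in this diff. The keys match keyword arguments accepted by `peft.LoraConfig` and `transformers.TrainingArguments`, but where the DPO settings go (on the trainer or on a config object) depends on the TRL version, so treat that mapping as an assumption.

```python
# Hyperparameters copied verbatim from the README lines in this diff.
# The grouping into three dicts is illustrative, not the author's code.
LORA = {"r": 16, "lora_alpha": 16, "lora_dropout": 0.05, "bias": "none"}

TRAINING = {
    "auto_find_batch_size": True,
    "gradient_checkpointing": True,
    "learning_rate": 5e-7,
    "lr_scheduler_type": "cosine",
    "max_steps": 3922,
    "optim": "paged_adamw_32bit",
    "warmup_steps": 100,
}

DPO = {"beta": 0.1, "max_prompt_length": 1024, "max_length": 1536}


def lora_scaling(cfg):
    """Standard LoRA scales the low-rank update by lora_alpha / r."""
    return cfg["lora_alpha"] / cfg["r"]


def response_budget(cfg):
    """Tokens left for the response when the prompt uses its full allowance."""
    return cfg["max_length"] - cfg["max_prompt_length"]


if __name__ == "__main__":
    print("LoRA scaling factor:", lora_scaling(LORA))
    print("Response token budget:", response_budget(DPO))
    print("Warmup share of training:", TRAINING["warmup_steps"] / TRAINING["max_steps"])
```

With these values the LoRA update is applied at a scale of 1.0, and a prompt truncated to the full 1024 tokens leaves 512 tokens for the response within the 1536-token limit.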