Text Generation · English · sft
Commit eadbb15 by jordiclive (1 parent: 732a0ac)

Update README.md

Files changed (1): README.md (+10 −3)

README.md CHANGED
```diff
@@ -6,7 +6,11 @@ datasets:
 - yahma/alpaca-cleaned
 ---
 
-This repo contains a low-rank adapter for LLaMA-7b fit on `Nebulous/gpt4all_pruned`, `sahil2801/CodeAlpaca-20k`, `yahma/alpaca-cleaned` and some datasets part of the OpenAssistant project.
+This repo contains a low-rank adapter for **LLaMA-7b** fit on
+- `Nebulous/gpt4all_pruned`
+- `sahil2801/CodeAlpaca-20k`
+- `yahma/alpaca-cleaned`
+- datasets part of the OpenAssistant project.
 
 
 This version of the weights was trained with the following hyperparameters:
@@ -15,5 +19,8 @@ This version of the weights was trained with the following hyperparameters:
 - Batch size: 128
 - Max Length: 2048
 - Learning rate: 4e-6
-- Lora _r_: 16
-- Lora target modules: q_proj, k_proj, v_proj, o_proj
+- Lora _r_: 8
+- Lora Alpha: 32
+- Lora target modules: q_proj, k_proj, v_proj, o_proj
+
+The model was trained with flash attention and gradient checkpointing.
```