---
license: apache-2.0
---
https://wandb.ai/open-assistant/supervised-finetuning/runs/7pz5n33h

exported checkpoint: 4500 steps

datasets:
```
oasst_export_eu:
  datasets:
    - oasst_export:
        lang: "en,es,de,fr"
        input_file_path: 2023-03-27_oasst_research_ready_synth.jsonl.gz
    - alpaca
    - oig_file:
        source_url: https://huggingface.co/datasets/laion/OIG/resolve/main/unified_chip2.jsonl
        max_count: 15000
        min_length: 500
        val_split: 0.2
    - oig_file:
        source_url: https://huggingface.co/datasets/laion/OIG/raw/main/unified_grade_school_math_instructions.jsonl
        val_split: 0.1
        min_length: 1000
  sort_by_length: false
  use_custom_sampler: false
```
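As a rough illustration of what the `min_length` and `val_split` options in the config above do, here is a minimal sketch (not the actual Open-Assistant data loader — the function name, record shape, and shuffling strategy are assumptions for illustration): records shorter than `min_length` characters are dropped, and a `val_split` fraction of the survivors is held out for validation.

```python
import random

def split_and_filter(records, min_length=500, val_split=0.2, seed=0):
    """Illustrative sketch: filter by minimum text length, then
    hold out a fraction of the remaining records for validation."""
    # Drop records whose text is shorter than min_length characters.
    kept = [r for r in records if len(r["text"]) >= min_length]
    # Shuffle deterministically before splitting.
    rng = random.Random(seed)
    rng.shuffle(kept)
    # Reserve val_split of the filtered records for validation.
    n_val = int(len(kept) * val_split)
    return kept[n_val:], kept[:n_val]  # (train, val)

# Toy example: one record is too short and gets filtered out.
records = [{"text": "x" * n} for n in (100, 600, 700, 800, 900, 1000)]
train, val = split_and_filter(records)
```

With six toy records and `min_length=500`, five survive the filter and one of those (20%) lands in the validation split.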

pythia:
```
pythia-12b:
  fp16: true
  log_dir: "pythia_log_12b"
  learning_rate: 6e-6
  model_name: EleutherAI/pythia-12b-deduped
  output_dir: pythia_model_12b
  weight_decay: 0.0
  residual_dropout: 0.2
  max_length: 2048
  use_flash_attention: true
  warmup_steps: 100
  gradient_checkpointing: false
  gradient_accumulation_steps: 4
  per_device_train_batch_size: 2
  per_device_eval_batch_size: 5
  eval_steps: 200
  save_steps: 500
  num_train_epochs: 16
  save_total_limit: 4
```
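Two of the hyperparameters above interact in ways worth spelling out: `per_device_train_batch_size` and `gradient_accumulation_steps` multiply into the effective per-device batch size, and `warmup_steps` ramps the learning rate up to the configured `learning_rate`. A minimal sketch, assuming a linear warmup schedule (the config does not state which scheduler the run used, so that choice is an assumption):

```python
def lr_at_step(step, peak_lr=6e-6, warmup_steps=100):
    """Assumed linear warmup: ramp from 0 to peak_lr over warmup_steps,
    then hold (the actual decay schedule of the run is not stated here)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr

# Effective batch size per device, from the config values above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
effective_batch_per_device = per_device_train_batch_size * gradient_accumulation_steps
```

So each optimizer step sees 8 sequences per device (times the number of GPUs, which the config does not specify), and the learning rate reaches its peak of 6e-6 after 100 steps.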