Undi95 committed on
Commit 47a417a
1 Parent(s): 8558f28

Update README.md

Files changed (1)
  1. README.md +53 -116
README.md CHANGED
@@ -1,137 +1,74 @@
  ---
  license: cc-by-nc-4.0
- base_model: NeverSleep/Noromaid-7B-0.4
- tags:
- - generated_from_trainer
- model-index:
- - name: out-dpo
-   results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.3.0`
- ```yaml
- base_model: NeverSleep/Noromaid-7B-0.4
- model_type: MistralForCausalLM
- tokenizer_type: LlamaTokenizer
- is_mistral_derived_model: true
-
- load_in_8bit: false
- load_in_4bit: false
- strict: false
-
- rl: true
- datasets:
-   - path: Intel/orca_dpo_pairs
-     split: train
-     type: intel_apply_chatml
-   - path: NobodyExistsOnTheInternet/ToxicDPOqa
-     split: train
-     type: intel_apply_chatml
-   - path: Undi95/toxic-dpo-v0.1-NoWarning
-     split: train
-     type: intel_apply_chatml
-
- dataset_prepared_path: last_run_prepared
- val_set_size: 0.0
- output_dir: ./out-dpo
- save_total_limit: 3
-
- sequence_len: 2048
- sample_packing: false
- pad_to_sequence_len: false
-
-
- wandb_project: Noromaid-Chatml-dpo
- wandb_entity:
- wandb_watch:
- wandb_run_id:
- wandb_log_model:
-
- gradient_accumulation_steps: 16
- micro_batch_size: 2
- num_epochs: 1
- optimizer: paged_adamw_8bit
- lr_scheduler: cosine
- learning_rate: 2e-8
-
- train_on_inputs: false
- group_by_length: false
- bf16: true
- fp16: false
- tf32: false
-
- gradient_checkpointing: true
- early_stopping_patience:
- resume_from_checkpoint:
- local_rank:
- logging_steps: 1
- xformers_attention:
- flash_attention: true
-
- warmup_steps: 10
- eval_steps:
- eval_table_size:
- eval_table_max_new_tokens: 128
- debug:
- deepspeed:
- weight_decay: 0.1
- fsdp:
- fsdp_config:
- special_tokens:
-   eos_token: "<|im_end|>"
- tokens:
-   - "<|im_start|>"
- save_safetensors: true
- saves_per_epoch: 2
- ```
 
- </details><br>

- # out-dpo

- This model is a fine-tuned version of [NeverSleep/Noromaid-7B-0.4](https://huggingface.co/NeverSleep/Noromaid-7B-0.4) on the None dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 2e-08
- - train_batch_size: 2
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 32
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 10
- - training_steps: 626

- ### Training results

- ### Framework versions

- - Transformers 4.37.0.dev0
- - Pytorch 2.0.1+cu118
- - Datasets 2.16.1
- - Tokenizers 0.15.0
 
  ---
  license: cc-by-nc-4.0
  ---

+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630dfb008df86f1e5becadc3/VKX2Z2yjZX5J8kXzgeCYO.png)
+
+
+ ---
+
+
+ # This model is a collab between [IkariDev](https://huggingface.co/IkariDev) and [Undi](https://huggingface.co/Undi95)!
+
+ <!-- description start -->
+ ## Description

+ <!-- [Recommended settings - contributed by localfultonextractor](https://files.catbox.moe/ue0tja.json) -->

+ This repo contains fp16 files of Noromaid-7b-v0.4-DPO.

+ [FP16 - by IkariDev and Undi](https://huggingface.co/NeverSleep/Noromaid-7B-0.4-DPO)

+ <!-- [GGUF - By TheBloke](https://huggingface.co/TheBloke/Athena-v4-GGUF)-->

+ <!-- [GPTQ - By TheBloke](https://huggingface.co/TheBloke/Athena-v4-GPTQ)-->

+ <!-- [exl2[8bpw-8h] - by AzureBlack](https://huggingface.co/AzureBlack/Echidna-13b-v0.3-8bpw-8h-exl2)-->

+ <!-- [AWQ - By TheBloke](https://huggingface.co/TheBloke/Athena-v4-AWQ)-->

+ <!-- [fp16 - by IkariDev+Undi95](https://huggingface.co/IkariDev/Athena-v4)-->

+ [GGUF - by IkariDev and Undi](https://huggingface.co/NeverSleep/Noromaid-7B-0.4-DPO-GGUF)
+ <!-- [OLD(GGUF - by IkariDev+Undi95)](https://huggingface.co/IkariDev/Athena-v4-GGUF)-->
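
For the GGUF files, a minimal llama-cpp-python sketch is shown below; it is illustrative only, and the `.gguf` filename is a placeholder for whichever quant you actually download from the GGUF repo.

```python
# Minimal sketch: run a GGUF quant of Noromaid-7b-v0.4-DPO with llama-cpp-python.
# The .gguf filename below is a placeholder; use the file you downloaded.
from llama_cpp import Llama

llm = Llama(model_path="noromaid-7b-v0.4-dpo.Q5_K_M.gguf", n_ctx=2048)

prompt = (
    "<|im_start|>system\nYou are Noromaid, a roleplay assistant.<|im_end|>\n"
    "<|im_start|>user\nIntroduce yourself.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
# Stop on <|im_end|> so the completion ends with the assistant turn.
out = llm(prompt, max_tokens=128, stop=["<|im_end|>"])
print(out["choices"][0]["text"])
```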

+ ## Ratings:

+ Note: We have permission from all users to upload their ratings; we DON'T screenshot random reviews without asking if we can put them here!
+
+ No ratings yet!
+
+ If you want your rating to be here, send us a message over on Discord and we'll put up a screenshot of it here. Discord names are "ikaridev" and "undi".
+
+ <!-- description end -->
+ <!-- prompt-template start -->
+
+ ## Prompt format: ChatML
+ ```
+ <|im_start|>system
+ {sysprompt}<|im_end|>
+ <|im_start|>user
+ {input}<|im_end|>
+ <|im_start|>assistant
+ {output}<|im_end|>
+ ```
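
As a quick illustration (not part of the card itself), here is a minimal transformers sketch that builds this ChatML prompt by hand and generates from the fp16 repo above; the system prompt, helper name, and generation settings are example assumptions.

```python
# Minimal sketch: build the ChatML prompt above and generate with the fp16 weights.
# device_map="auto" needs the `accelerate` package; adjust dtype/device to your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NeverSleep/Noromaid-7B-0.4-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

def chatml_prompt(sysprompt: str, user_input: str) -> str:
    # Mirrors the template: system turn, user turn, then an open assistant turn.
    return (
        f"<|im_start|>system\n{sysprompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_input}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = chatml_prompt("You are Noromaid, a roleplay assistant.", "Introduce yourself.")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# The training config sets <|im_end|> as the eos token, so generation should stop at the end of the turn.
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```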

+ ## Training data used:
+ - [no_robots dataset](https://huggingface.co/Undi95/Llama2-13B-no_robots-alpaca-lora) lets the model behave in a more human way and enhances the output.
+ - [Aesir Private RP dataset] New data from a never-before-used dataset: fresh data, no LimaRP spam, 100% new. Thanks to the [MinervaAI Team](https://huggingface.co/MinervaAI) and, in particular, [Gryphe](https://huggingface.co/Gryphe) for letting us use it!
+ - [Another private Aesir dataset]
+ - [Another private Aesir dataset]
+ - [limarp](https://huggingface.co/datasets/lemonilia/LimaRP)

+ ## DPO training data used:
+ - [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
+ - [NobodyExistsOnTheInternet/ToxicDPOqa](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicDPOqa)
+ - [Undi95/toxic-dpo-v0.1-NoWarning](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-NoWarning)

+ This is a full finetune.
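
To peek at one of these preference sets, a minimal `datasets` sketch (illustrative; it only prints whatever fields the dataset defines):

```python
# Minimal sketch: inspect one of the DPO preference datasets listed above.
from datasets import load_dataset

ds = load_dataset("Intel/orca_dpo_pairs", split="train")
print(ds.column_names)  # preference-pair fields as defined by the dataset
print(ds[0])            # first example
```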

+ ## Others

+ Undi: If you want to support me, you can [here](https://ko-fi.com/undiai).

+ IkariDev: Visit my [retro/neocities style website](https://ikaridevgit.github.io/) please kek