Hellisotherpeople/toxic-dpo-v2-mistral7b-lora

Not-For-All-Audiences


Hellisotherpeople committed on Mar 7

Commit c84e52b (1 parent: 0132716)

Update README.md

Files changed (1)

README.md +1 -1


@@ -13,7 +13,7 @@ tags:
 # Model Card for toxic-dpo-v2-mistral7b-lora
 This is an example of how easy it is to effeciently fine-tune a SOTA 7B paramater language model adapter (in this case badsai_jaskier-7b-dpo-v5.6 was used, but the lora works on most mistral 7b models)
-to remove "alignment" and "safety" restrictions on a single 3090 GPU with about 12 hours worth of compute.
+to remove "alignment" and "safety" restrictions on a single 3090 GPU with about 12 hours worth of compute on a dataset of about 600 examples.
 The dataset used was https://huggingface.co/datasets/unalignment/toxic-dpo-v0.2
 Trained for 9 epochs