Safetensors
Not-For-All-Audiences
xzuyn committed on
Commit
3a3c420
1 Parent(s): 0d159f9

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -8,6 +8,8 @@ tags:
  ---
  Trained on [NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal). I converted the set to a preference dataset using refusals generated from LLaMa-3-Instruct-8B. I have not recreated the rejections with LLaMa-3.1-Instruct-8B yet.
  
+ This is an early test.
+
  ![train/rewards](https://huggingface.co/PJMixers/LLaMa-3.1-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/images/rewards.png)
  ![train/logits](https://huggingface.co/PJMixers/LLaMa-3.1-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/images/logits.png)
  ![train/logps](https://huggingface.co/PJMixers/LLaMa-3.1-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/images/logps.png)
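
The README says the set was converted to a preference dataset with generated refusals as the rejections. A minimal sketch of that kind of conversion might look like the following. It is not the author's script: the field names `question` and `answer`, the helper `to_preference_rows`, and the `prompt`/`chosen`/`rejected` output layout are all assumptions made for illustration, and the refusals are taken as an already-generated list.

```python
def to_preference_rows(qa_rows, refusals):
    """Pair each original answer (chosen) with a generated refusal (rejected).

    qa_rows:  list of dicts with hypothetical keys "question" and "answer".
    refusals: list of refusal strings, one per QA row, in the same order.
    """
    if len(qa_rows) != len(refusals):
        raise ValueError("need exactly one refusal per QA row")

    rows = []
    for qa, refusal in zip(qa_rows, refusals):
        rows.append(
            {
                "prompt": qa["question"],   # the user question
                "chosen": qa["answer"],     # the answer from the source set
                "rejected": refusal,        # the model-generated refusal
            }
        )
    return rows


if __name__ == "__main__":
    example = to_preference_rows(
        [{"question": "Example question?", "answer": "Example answer."}],
        ["I can't help with that."],
    )
    print(example[0])
```

Keeping the pairing in a plain function makes it straightforward to regenerate the rejected side later, for example with refusals from a different model, without touching the prompts or chosen answers.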