Update README.md
README.md
CHANGED
@@ -54,3 +54,8 @@ Additional inclusions (random sampled sub-set, cursorily quality-checked) from:
 - [anthracite-org/nopm_claude_writing_fixed](https://huggingface.co/datasets/anthracite-org/nopm_claude_writing_fixed)
 
 As such, the dataset is not 100% slop free, but this addition likely helps the model be a better roleplayer. At some point, I intend to clean up and release the samples, deslopped.
+
+Note on training:
+
+The training was done using [Fine-Tuning with Very Large Dropout](https://arxiv.org/pdf/2403.00946), with a LoRA dropout of 0.5 and a constant learning rate of 4e-6. In addition, the model seemed to retain more of Nemotron's smartness when the alpha was halved at merge time, which is how this merge (and the LoRA adapter configuration) is set up. (The LoRA was trained with alpha=64 and merged with alpha=32.)
+
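
For anyone reproducing the setup, below is a minimal sketch of what that configuration might look like with the Hugging Face `peft` library. Only the dropout of 0.5, the alpha values (64 for training, 32 at merge), and the constant 4e-6 learning rate come from the note above; the rank, target modules, and model/adapter paths are placeholder assumptions.

```python
# Sketch only: r, target_modules, and the model/adapter paths are assumptions,
# not the values actually used for this model.
from peft import LoraConfig, PeftModel
from transformers import AutoModelForCausalLM

BASE_MODEL = "path/to/nemotron-base"      # placeholder
ADAPTER_PATH = "path/to/trained-adapter"  # placeholder

# LoRA configuration for fine-tuning with very large dropout (arXiv:2403.00946).
train_config = LoraConfig(
    r=64,                        # assumed rank; not stated in the README
    lora_alpha=64,               # alpha used during training
    lora_dropout=0.5,            # the "very large dropout"
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
# Training would then use a constant learning rate of 4e-6, e.g.
# TrainingArguments(learning_rate=4e-6, lr_scheduler_type="constant", ...).

# To merge with alpha halved to 32: the LoRA scaling factor (alpha / r) is
# fixed when the adapter is loaded, so lower `lora_alpha` from 64 to 32 in the
# saved adapter_config.json first, then load and merge.
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
merged = PeftModel.from_pretrained(base, ADAPTER_PATH).merge_and_unload()
merged.save_pretrained("path/to/merged-model")
```

Because the merged weight delta is scaled by alpha / r, halving the alpha simply halves the adapter's contribution to the base weights, which is an inexpensive way to blend the fine-tune in at reduced strength.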