Update README.md
README.md
CHANGED
@@ -54,3 +54,8 @@ Additional inclusions (random sampled sub-set, cursorily quality-checked) from:
 - [anthracite-org/nopm_claude_writing_fixed](https://huggingface.co/datasets/anthracite-org/nopm_claude_writing_fixed)
 
 As such, the dataset is not 100% slop free, but this addition likely helps the model be a better roleplayer. At some point, I intend to clean up and release the samples, deslopped.
+
+Note on training:
+
+The training was done using [Fine-Tuning with Very Large Dropout](https://arxiv.org/pdf/2403.00946), with a LoRA dropout of 0.5 and a constant learning rate of 4e-6. In addition, the model seemed to retain more of Nemotron's smartness when the alpha was halved at merge time, which is how this merge (and the LoRA adapter configuration) is set up. (The LoRA was trained with alpha=64 and merged with alpha=32.)
+
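
For anyone reproducing the setup, below is a minimal sketch of what that configuration might look like with the Hugging Face `peft` library. Only the dropout of 0.5, the alpha values (64 for training, 32 at merge), and the constant 4e-6 learning rate come from the note above; the rank, target modules, and model/adapter paths are placeholder assumptions.

```python
# Sketch only: r, target_modules, and the model/adapter paths are assumptions,
# not the values actually used for this model.
from peft import LoraConfig, PeftModel
from transformers import AutoModelForCausalLM

BASE_MODEL = "path/to/nemotron-base"      # placeholder
ADAPTER_PATH = "path/to/trained-adapter"  # placeholder

# LoRA configuration for fine-tuning with very large dropout (arXiv:2403.00946).
train_config = LoraConfig(
    r=64,                        # assumed rank; not stated in the README
    lora_alpha=64,               # alpha used during training
    lora_dropout=0.5,            # the "very large dropout"
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
# Training would then use a constant learning rate of 4e-6, e.g.
# TrainingArguments(learning_rate=4e-6, lr_scheduler_type="constant", ...).

# To merge with alpha halved to 32: the LoRA scaling factor (alpha / r) is
# fixed when the adapter is loaded, so lower `lora_alpha` from 64 to 32 in the
# saved adapter_config.json first, then load and merge.
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
merged = PeftModel.from_pretrained(base, ADAPTER_PATH).merge_and_unload()
merged.save_pretrained("path/to/merged-model")
```

Because the merged weight delta is scaled by alpha / r, halving the alpha simply halves the adapter's contribution to the base weights, which is an inexpensive way to blend the fine-tune in at reduced strength.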