medmekk (HF staff) committed
Commit 009a6fe
1 Parent(s): 9be9140

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -81,7 +81,7 @@ The model was trained on a subset of [FineWeb-edu](https://huggingface.co/datase
   - Activations quantized to 8-bit precision

  10. **Key Findings**
- - Warmup quantization (linear lambda scheduler) proved crucial for performance
+ - Warmup quantization (sigmoid or linear lambda scheduler) proved crucial for performance

 These 10B token training runs showed that it's possible to effectively fine-tune pre-trained models to 1.58-bit precision, achieving strong performance with relatively limited additional training data.
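
The schedulers named in the changed line can be sketched as below. This is a minimal illustration, not the repository's actual training code: it assumes the common warmup-quantization setup in which a coefficient lambda ramps from 0 (full-precision weights) to 1 (fully quantized weights) over a warmup period, and the steepness parameter `k` in the sigmoid variant is a hypothetical choice.

```python
import math

def lambda_linear(step: int, warmup_steps: int) -> float:
    """Linear lambda scheduler: ramps 0 -> 1 over warmup_steps, then stays at 1."""
    return min(1.0, step / warmup_steps)

def lambda_sigmoid(step: int, warmup_steps: int, k: float = 10.0) -> float:
    """Sigmoid lambda scheduler: smooth S-shaped ramp centered at warmup_steps / 2.

    `k` (hypothetical steepness parameter) controls how sharply the
    transition from full precision to quantized happens.
    """
    return 1.0 / (1.0 + math.exp(-k * (step / warmup_steps - 0.5)))

def mix_weights(w_fp: float, w_quant: float, lam: float) -> float:
    """Blend full-precision and quantized weights during warmup:
    effective weight = (1 - lambda) * w_fp + lambda * w_quant.
    """
    return (1.0 - lam) * w_fp + lam * w_quant
```

With either scheduler, early steps train almost entirely on full-precision weights, and quantization is introduced gradually; the sigmoid variant spends longer near the endpoints and transitions faster through the middle than the linear ramp.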