Models upscaled with the Block Expansion (BE) method. Unlike the more common DUP Scaling, BE does not require fine-tuning to recover lost performance.
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Interleaved (Text Generation)
- Pretergeek/OpenChat-3.5-0106_8.99B_40Layers-Interleaved (Text Generation)
- Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved (Text Generation)
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Appended (Text Generation)
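
The sketch below illustrates why Block Expansion can leave the base model's output unchanged and what "Interleaved" and "Appended" might mean for layer placement. It is a toy example under stated assumptions, not the code used to build these models: the blocks are scalar stand-ins for transformer layers, the 32-layer base depth is assumed from the Mistral-7B architecture that OpenChat-3.5 builds on, and the helper names (`identity_copy`, `expand_interleaved`, `expand_appended`) and the exact placement of the new blocks are hypothetical.

```python
import copy

def make_block(scale):
    # Toy residual block on scalars: output = x + w_out * relu(w_in * x).
    return {"w_in": scale, "w_out": 0.5 * scale}

def run_block(block, x):
    hidden = max(0.0, block["w_in"] * x)
    return x + block["w_out"] * hidden

def run_model(blocks, x):
    for block in blocks:
        x = run_block(block, x)
    return x

def identity_copy(block):
    # Copy a block, then zero its output projection so the residual
    # branch adds nothing and the new block starts as an identity map.
    new_block = copy.deepcopy(block)
    new_block["w_out"] = 0.0
    return new_block

def expand_interleaved(blocks, group_size):
    # Assumed layout: one identity copy after every `group_size` original blocks.
    expanded = []
    for i, block in enumerate(blocks, start=1):
        expanded.append(block)
        if i % group_size == 0:
            expanded.append(identity_copy(block))
    return expanded

def expand_appended(blocks, num_new):
    # Assumed layout: all identity copies placed after the last original block.
    return list(blocks) + [identity_copy(blocks[-1]) for _ in range(num_new)]

# Assumed base depth of 32 layers.
base = [make_block(0.01 * (i + 1)) for i in range(32)]
interleaved = expand_interleaved(base, group_size=8)
appended = expand_appended(base, num_new=4)

print(len(interleaved), len(appended))
print(run_model(base, 1.0) == run_model(interleaved, 1.0))
```

Under these assumptions, group sizes of 8, 4, and 2 give 36, 40, and 48 layers, which matches the layer counts in the model names above. Because every added block starts as an identity map, the expanded toy model reproduces the base model's output exactly.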