Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face
Nice blog!
@osanseviero
we have been doing this in TGI and TEI for a while ;)
Padding-free implementations also make dynamic batching easier to implement and make memory usage more predictable.
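
To make the memory point concrete, here is a rough sketch in plain Python. It is not the TGI or TEI code; `pack`, `padded_size`, and `fits` are hypothetical helpers, and `cu_seqlens` is just the name commonly used for cumulative sequence offsets in variable-length attention kernels. The idea is that a packed batch costs exactly the sum of the sequence lengths, so admitting a new request is a simple budget check, while a padded batch costs batch size times the longest sequence.

```python
from itertools import accumulate


def pack(sequences):
    """Concatenate sequences into one flat token list plus cumulative offsets.

    Sequence i occupies flat[cu_seqlens[i]:cu_seqlens[i + 1]].
    """
    flat = [tok for seq in sequences for tok in seq]
    cu_seqlens = [0, *accumulate(len(seq) for seq in sequences)]
    return flat, cu_seqlens


def padded_size(sequences):
    """Token slots a padded batch would occupy: batch size * longest sequence."""
    return len(sequences) * max(len(seq) for seq in sequences)


def fits(sequences, new_seq, token_budget):
    """Hypothetical admission check for dynamic batching.

    With packing, the cost of adding a request is just its own length,
    independent of how long the other sequences in the batch are.
    """
    return sum(len(seq) for seq in sequences) + len(new_seq) <= token_budget


batch = [[1, 2, 3], [4], [5, 6]]
flat, cu_seqlens = pack(batch)

print(len(flat), padded_size(batch))  # packed token count vs padded slot count
print(cu_seqlens)
print(fits(batch, [7, 8], token_budget=8))
```

With padding, one long request inflates the cost of every other row in the batch, so the scheduler has to reason about the maximum length. With packing, it only has to track a running token count.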