3B Version Weights
Will we ever get the weights for the 3B version or will Mistral just be releasing those of the 8B version? The whole point of these tiny models is to run them locally on device, and from reading the Mistral news article on this topic, it seems like they're only offering the 3B through cloud services. This just seems really unusual to me. Am I missing something here?
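To be concrete about what we're missing: with any open-weights small model, local inference is a few lines. A minimal sketch below uses Hugging Face transformers with Llama 3.2 3B as a stand-in, since there's no Mistral 3B checkpoint to point at; the model ID and generation settings are just illustrative.

```python
# Sketch of local, on-device inference with an open-weights 3B model via
# Hugging Face transformers. Llama 3.2 3B is a stand-in here, since Mistral 3B
# has no downloadable checkpoint; swap in any open small model you prefer.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",  # gated repo: requires accepting Meta's license
    device_map="auto",  # uses a GPU if available, otherwise falls back to CPU
)

messages = [{"role": "user", "content": "Give me three uses for an on-device LLM."}]
result = pipe(messages, max_new_tokens=128)

# The pipeline returns the full chat; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```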
Why would MistralAI limit access to Mistral 3B to an API only, when the far superior, closed-weights GPT-4o-mini is already available through an API with text and vision input from any device? What's the leverage? I'm confused. The pricing for Mistral 3B API access doesn't look profitable anyway. What's preventing them from releasing the open weights? Is MistralAI worried about independent benchmark results? I hope they eventually release the open weights, even if those don't perform as well as Mistral's own benchmarks suggest.
What's odd is that this is, so far, basically the only Mistral model without open weights or an open replacement. Medium and Large v1 were also API-only, but we got open replacements for those in 8x22B and Large v2. It's not like a 3B model is dangerous or anything... right?
Mistral might think their 3B is so good that if they release weights, their API would be dead or something?
It's not even that crazy of a model. From their benchmarks (and who knows how accurate those are) of what I assume is the instruction-tuned variant, 3B seems to fall somewhere between Llama 3.2 3B and Qwen2.5 3B. It certainly wouldn't kill the need for their API.
If the model truly were 3B and performed as well as shown in the benchmarks, they would have open-sourced it. It's more likely that the model is larger than 3B but they've chosen to market it that way, and nobody can confirm anything through the API. I can't think of a better explanation for why they're being so restrictive about this little guy. I doubt it's about avoiding competition with Microsoft's own 3B models; that doesn't make sense to me. It's more like they don't want to show that they can't release genuinely good models at this size yet.