3B Version Weights
Will we ever get the weights for the 3B version or will Mistral just be releasing those of the 8B version? The whole point of these tiny models is to run them locally on device, and from reading the Mistral news article on this topic, it seems like they're only offering the 3B through cloud services. This just seems really unusual to me. Am I missing something here?
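To be concrete about what we're missing: with any open-weights small model, local inference is a few lines. A minimal sketch below uses Hugging Face transformers with Llama 3.2 3B as a stand-in, since there's no Mistral 3B checkpoint to point at; the model ID and generation settings are just illustrative.

```python
# Sketch of local, on-device inference with an open-weights 3B model via
# Hugging Face transformers. Llama 3.2 3B is a stand-in here, since Mistral 3B
# has no downloadable checkpoint; swap in any open small model you prefer.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",  # gated repo: requires accepting Meta's license
    device_map="auto",  # uses a GPU if available, otherwise falls back to CPU
)

messages = [{"role": "user", "content": "Give me three uses for an on-device LLM."}]
result = pipe(messages, max_new_tokens=128)

# The pipeline returns the full chat; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```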
Why would MistralAI limit access to Mistral 3B to an API only, when the far superior, closed-weights GPT-4o-mini is already available through an API with text and vision input from any device? What's the leverage? I'm confused. The pricing for Mistral 3B API access doesn't look profitable anyway. What's preventing them from releasing the open weights? Is MistralAI worried about independent benchmark results? I hope they eventually release the open weights, even if those don't perform as well as Mistral's own benchmarks suggest.
What's odd is that this is, so far, basically the only Mistral model without open weights or an open replacement. Medium and Large v1 were also API-only, but we got open replacements for those in 8x22B and Large v2. It's not like a 3B model is dangerous or anything... right?
Mistral might think their 3B is so good that if they release weights, their API would be dead or something?
It's not even that crazy of a model. From their benchmarks (and who knows how accurate those are) of what I assume is the instruction-tuned variant, 3B seems to fall somewhere between Llama 3.2 3B and Qwen2.5 3B. It certainly wouldn't kill the need for their API.
If the model truly were 3B and performed as well as shown in the benchmarks, they would have open-sourced it. It's more likely that the model is larger than 3B but they've chosen to market it that way, and nobody can confirm anything through the API. I can't think of a better explanation for why they're being so restrictive about this little guy. I doubt it's about avoiding competition with Microsoft's own 3B models; that doesn't make sense to me. It's more like they don't want to show that they can't release genuinely good models at this size yet.