vocab size mismatch

#9
by mradermacher - opened

config.json says vocab_size is 32000, while the instruct model uses 32768. The model also doesn't load properly. I don't think the vocab size should differ, and I suspect it should be set to 32768, just like in the instruct model (which does load).
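
For reference, the config discrepancy can be seen without downloading any weights. This is only a quick sketch; it assumes `transformers` is installed and that access to the gated repos has already been granted:

```python
from transformers import AutoConfig

# Compare the declared vocab sizes of the base and instruct repos.
base = AutoConfig.from_pretrained("mistralai/Mixtral-8x22B-v0.1")
instruct = AutoConfig.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")
print(base.vocab_size, instruct.vocab_size)  # reported above as 32000 vs 32768
```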

The problem is that the vocab sizes really do differ: the model state dict has a vocab size of 32000, matching the config.
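
To confirm what is actually in the checkpoint without loading the full model, one can read the embedding shape from the safetensors shard. A sketch only: it assumes `huggingface_hub` and `safetensors` are installed, the transformers-style tensor name `model.embed_tokens.weight`, and it still downloads the multi-GB shard that holds the embedding:

```python
import json

from huggingface_hub import hf_hub_download
from safetensors import safe_open

repo = "mistralai/Mixtral-8x22B-v0.1"

# The index maps each tensor name to the shard file that contains it.
index_path = hf_hub_download(repo, "model.safetensors.index.json")
with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

shard = weight_map["model.embed_tokens.weight"]
shard_path = hf_hub_download(repo, shard)

# Read only the tensor's shape from the shard header.
with safe_open(shard_path, framework="pt") as f:
    print(f.get_slice("model.embed_tokens.weight").get_shape())  # [32000, 6144] per the report above
```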

Hi Mistral team, when loading Mixtral-8x22B with AutoTokenizer, the tokenizer's vocab_size comes out as 32768; however, config.json shows vocab_size as 32000, and the embedding layer has a shape of 32000x6144. Why is there a mismatch, and how should it be handled?
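
The mismatch is visible without loading any weights, e.g. (a minimal sketch, assuming `transformers` is installed and repo access is granted):

```python
from transformers import AutoConfig, AutoTokenizer

repo = "mistralai/Mixtral-8x22B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(repo)
config = AutoConfig.from_pretrained(repo)

# Reported above: the tokenizer has 32768 tokens, while the config
# (and the 32000x6144 embedding) expects 32000.
print(len(tokenizer), config.vocab_size)
```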

Hi Mistral team! The current Mixtral-8x22B-v0.1 has a tokenizer mismatch issue. I can load the tokenizer from Mixtral-8x7B-v0.1 to work around it, but when will this repo be updated, like the current Mixtral-8x22B-Instruct-v0.1, which now has the correct tokenizer? Thanks!
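
For anyone hitting this in the meantime, the workaround above looks roughly like this (a sketch, assuming enough GPU memory and `accelerate` installed for `device_map="auto"`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the 8x7B tokenizer (32000 tokens), which matches the base 8x22B
# embedding, instead of the 32768-token tokenizer currently in this repo.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x22B-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```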

Mistral AI_ org

I've merged a revision that should fix this issue!
