mistral-community
/

mixtral-8x22B-v0.3-original

Model card Files Files and versions Community

conversion to HF

by ehartford - opened May 25

Discussion

ehartford

Unofficial Mistral Community org May 25

I cannot find how to convert this to HF, @v2ray can you please show me the way?

mrfakename

Unofficial Mistral Community org May 25

@ehartford https://github.com/huggingface/transformers/blob/main/src/transformers/models/mixtral/convert_mixtral_weights_to_hf.py

ehartford

Unofficial Mistral Community org May 25

I'm aware of the script.

How to use it to convert 8x22b is far from self evident.

v2ray

Unofficial Mistral Community org May 25

•

edited May 25

@ehartford https://huggingface.co/v2ray/Mixtral-8x22B-v0.1/blob/main/convert.py

python convert.py --input-dir /path/to/original --model-size 22B --output-dir /path/to/save

ehartford

Unofficial Mistral Community org May 25

Thanks!

ehartford

Unofficial Mistral Community org May 25

I will do this immediately

ehartford

Unofficial Mistral Community org May 25

max_position_embeddings = params["max_seq_len"]
                             ~~~~~~^^^^^^^^^^^^^^^

It wants "max_seq_len"

I see there isn't one in params.json

{
    "dim": 6144,
    "n_layers": 56,
    "head_dim": 128,
    "hidden_dim": 16384,
    "n_heads": 48,
    "n_kv_heads": 8,
    "norm_eps": 1e-05,
    "vocab_size": 32768,
    "rope_theta": 1000000.0,
    "moe": {
        "num_experts": 8,
        "num_experts_per_tok": 2
    }
}

I will try setting it to 32768

mrfakename

Unofficial Mistral Community org May 25

I thought it was 64k?

ehartford

Unofficial Mistral Community org May 25

Ok thank you 😊

ehartford

Unofficial Mistral Community org May 25

ok that worked, but didn't create a tokenizer

ehartford

Unofficial Mistral Community org May 25