GGUF version please
Could you please release the GGUF version of this model?
Yeah. I am busy today but will kick off the imatrix quantization tonight, I have been meaning to mess with that anyway.
That's great. I'm waiting for that.
Please release Q5_K_M and Q4_K_M too, if that's possible.
Yeah, they will all be imatrixed.
Trying to convert to GGUF but it's missing a tokenizer.model file. Using the one from regular Yi leads to other errors.
Still doing this, but I literally fell asleep on my keyboard, lol.
I think I know how to generate a tokenizer as well, let's see
I'm really looking forward to it.
I'm kinda stumped tbh, if I run:
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("/home/alpha/Models/Raw/RPmerge/")
tok.save_pretrained("/home/alpha/Models/Raw/temp/", legacy_format=True)
There is still no option to output a tokenizer.model. Currently trying to trace back and see how it's even generated.
python convert.py /home/alpha/Models/Raw/RPmerge/ --vocab-only --vocab-type hfft --outfile tokenizer.model
Seems to work? I will quantize and see if it actually does.
I'm waiting for the results.
GGUFs uploading now: https://huggingface.co/MarsupialAI/Yi-34B-200K-RPMerge_GGUF
Do think of adding the Orca-Vicuna chat template to tokenizer_config.json:
"chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{message['role'] + ' :' + message['content'] + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ 'ASSISTANT: ' }}{% endif %}",
I'll make my own quants just in case.
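For anyone who wants to sanity-check what that template renders, here's a minimal pure-Python sketch that mimics its logic (not the real Jinja engine; the roles and contents below are just placeholders):

```python
def render_orca_vicuna(messages, add_generation_prompt=False):
    # Mirrors the Jinja template as posted above (including the
    # " :" spacing that gets corrected later in the thread):
    # each message becomes "<ROLE> :<content>\n", and an optional
    # "ASSISTANT: " generation prompt is appended at the end.
    out = ""
    for m in messages:
        out += m["role"] + " :" + m["content"] + "\n"
    if add_generation_prompt:
        out += "ASSISTANT: "
    return out

history = [
    {"role": "SYSTEM", "content": "You are a narrator."},
    {"role": "USER", "content": "Hello!"},
]
print(render_orca_vicuna(history, add_generation_prompt=True))
```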
Yeah I figured it out as well, making some imatrix quants
This is a good idea.
Is this template correct though? I don't see anything that adds the USER: or SYSTEM: message.
Uploading now: https://huggingface.co/brucethemoose/Yi-34B-200K-RPMerge-iMat.GGUF
> Is this template correct though? I don't see anything that adds the USER: or SYSTEM: message.
The {{message['role'] + ' :' + message['content'] + '\n'}} part is what adds the prefix: message['role'] is USER, SYSTEM, or ASSISTANT.
{% if add_generation_prompt %}{{ 'ASSISTANT: ' }}{% endif %} adds 'ASSISTANT: ' when you only send the history.
Also, there's a spacing error in what I posted: to match 'ASSISTANT: ', it should be message['role'] + ': ' + message['content'] (colon right after the role, then a space).
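With that spacing fix applied, the corrected tokenizer_config.json entry would presumably look like this (untested sketch):

```json
"chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{message['role'] + ': ' + message['content'] + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ 'ASSISTANT: ' }}{% endif %}",
```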
Whoops
Thank you so much for all your hard work. I really appreciate it.