Are the configuration files similar to Qwen2-0.5b?
Are the configuration files of Thouph/prompt2tag-qwen2-0.5b-v0.1 similar to those of Qwen2-0.5b, or have you changed tokenizer_config.json and other files as well?
I am asking because I was thinking about quantizing the model to GGUF.
I am trying convert_hf_to_gguf.py from llama.cpp and I am running into:
INFO:hf-to-gguf:blk.9.attn_v.weight, torch.float32 --> F16, shape = {896, 128}
INFO:hf-to-gguf:output_norm.weight, torch.float32 --> F32, shape = {896}
Traceback (most recent call last):
  File "C:\Prog\Development\Llama.Cpp-Toolbox_3Simplex\Llama.Cpp-Toolbox\llama.cpp\convert_hf_to_gguf.py", line 3953, in <module>
    main()
  File "C:\Prog\Development\Llama.Cpp-Toolbox_3Simplex\Llama.Cpp-Toolbox\llama.cpp\convert_hf_to_gguf.py", line 3947, in main
    model_instance.write()
  File "C:\Prog\Development\Llama.Cpp-Toolbox_3Simplex\Llama.Cpp-Toolbox\llama.cpp\convert_hf_to_gguf.py", line 387, in write
    self.prepare_tensors()
  File "C:\Prog\Development\Llama.Cpp-Toolbox_3Simplex\Llama.Cpp-Toolbox\llama.cpp\convert_hf_to_gguf.py", line 280, in prepare_tensors
    for new_name, data in ((n, d.squeeze().numpy()) for n, d in self.modify_tensors(data_torch, name, bid)):
                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Prog\Development\Llama.Cpp-Toolbox_3Simplex\Llama.Cpp-Toolbox\llama.cpp\convert_hf_to_gguf.py", line 252, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Prog\Development\Llama.Cpp-Toolbox_3Simplex\Llama.Cpp-Toolbox\llama.cpp\convert_hf_to_gguf.py", line 200, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'score.weight'
I've been using the configuration files from qwen2-0.5b (with a simple change to eos in tokenizer_config.json, which should be irrelevant here), and I changed the model architecture in config.json to

"architectures": [
    "Qwen2ForCausalLM"
],

otherwise the Python script would not even start.
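
For reference, a minimal way to confirm which tensors the checkpoint actually contains before converting (a sketch, assuming a safetensors checkpoint; the filename model.safetensors is a guess on my part):

# Diagnostic sketch: list the tensor names stored in the checkpoint.
# The filename "model.safetensors" is an assumption; adjust it to the
# actual file downloaded from the repo.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    for name in f.keys():
        print(name)
# Seeing a 'score.weight' entry (and no 'lm_head.weight') is consistent
# with the ValueError above: the converter has no Qwen2ForCausalLM
# tensor to map it to.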
This model doesn't use Qwen2ForCausalLM. It uses Qwen2ForSequenceClassification, and I'm not sure whether it can be converted to GGUF format. I made one simple architectural change to the model, replacing its autoregressive generation head with a text classification head. After that change, it may no longer be compatible with text-generation-oriented frameworks like llama.cpp.
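
As a rough illustration (a minimal sketch, assuming only a standard transformers install), loading the model and printing its head shows the layer behind the 'score.weight' tensor that convert_hf_to_gguf.py cannot map:

# Sketch: inspect the classification head that replaces the LM head.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "Thouph/prompt2tag-qwen2-0.5b-v0.1"
)
print(type(model).__name__)  # Qwen2ForSequenceClassification
# The head is a Linear layer named `score`, serialized as 'score.weight';
# Qwen2ForCausalLM has `lm_head` here instead, which is why the
# converter's tensor map fails.
print(model.score)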
If you want an example of how to load and run this model with the Hugging Face transformers Python package, you can take a look at this script: https://huggingface.co/spaces/Thouph/prompt2tag-qwen2-0.5b-v0.1-demo/blob/main/app.py
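
Roughly, the flow looks like this (a minimal sketch of the standard transformers sequence-classification usage; details such as the sigmoid readout and the 0.3 threshold are illustrative assumptions, so treat app.py as the authoritative version):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo = "Thouph/prompt2tag-qwen2-0.5b-v0.1"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)
model.eval()

inputs = tokenizer("a prompt to tag", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)

# Multi-label readout: sigmoid + threshold. Both are assumptions here;
# check app.py for the real post-processing.
probs = torch.sigmoid(logits)[0]
tags = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.3]
print(tags)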
Ok, thank you for your fast responses and explanations!