Q6_K Output is just exclamation marks, e.g., !!!!!!!!!!!!!!!!!!!!

#5
by paolovic - opened

Hi,

Although I am sticking to the prompt template, my output contains only exclamation marks.
I am using Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K.gguf. I merged the split files with llama.cpp to make them compatible with vLLM, like this:

./llama-gguf-split --merge /models/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF/Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K/Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K-00001-of-00002.gguf Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K.gguf

This is my input

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

you are helpful ai assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

how many r in strawberry?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

The model's output is

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[... output continues with nothing but exclamation marks ...]

Any advice?

Thank you in advance
Best regards
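
As a side note, the hand-written prompt above can be cross-checked against the tokenizer's own chat template. A minimal sketch, assuming the original nvidia/Llama-3.1-Nemotron-70B-Instruct-HF tokenizer is reachable on the Hub:

# Sketch: rebuild the prompt via the HF chat template and compare it to the
# hand-written string above (assumes network access to the original repo).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("nvidia/Llama-3.1-Nemotron-70B-Instruct-HF")
messages = [
    {"role": "system", "content": "you are helpful ai assistant"},
    {"role": "user", "content": "how many r in strawberry?"},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # should match the template shown above token-for-token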

Hmm, everything looks reasonable... feels like a RoPE setting is off or the prompt isn't going through properly. Is there a way to show the full input/output of the model, to make sure you're not accidentally giving the system prompt twice or something silly?

Hi @bartowski ,
sorry for keeping you waiting; yes, I'll provide it asap.
Best regards

Hi @bartowski ,

Here is the logged output:

INFO 2024-10-24 11:38:20,276 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:274 - prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

give me something else than exclamation marks<|eot_id|><|start_header_id|>assistant<|end_header_id|>


INFO 2024-10-24 11:38:20,631 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:20,785 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,068 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,124 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,180 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,236 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,292 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,347 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,403 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
INFO 2024-10-24 11:38:21,459 LLaMA 3.1 70B vllmAPI ipru2wih 123456 llama70b/chat/completions/ vllm.py:193 - !
...
<4000 lines of "!">

Fascinating... is there any debug info on the vLLM side? Seems likely it's a vLLM issue, unfortunately. Any chance you can try without the prompt formatting, in case vLLM is applying it on its own?

Maybe, pure speculation, but giving it the entire pre-formatted prompt like that is translating to:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful AI assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

give me something else than exclamation marks<|eot_id|><|start_header_id|>assistant<|end_header_id|><|eot_id|><|start_header_id|>assistant<|end_header_id|>

which is making it act up

I'm downloading it locally to double-check in llama.cpp, but I assume it's working there since no one else has reported any issues.
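
One way to rule out double templating is to compare vLLM's plain completions endpoint (which applies no chat template) against its chat endpoint (which applies the template server-side). A hedged sketch; the base URL and served model name below are assumptions about this particular deployment:

# Sketch: send the already-formatted prompt raw, and plain messages via chat,
# then compare the two responses. BASE and MODEL are placeholders.
import requests

BASE = "http://localhost:8000/v1"
MODEL = "Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K"  # hypothetical served model name

raw_prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful AI assistant.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "give me something else than exclamation marks<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# Already-formatted prompt, no template added by the server
r1 = requests.post(f"{BASE}/completions",
                   json={"model": MODEL, "prompt": raw_prompt, "max_tokens": 64})

# Plain messages, server applies the chat template itself
r2 = requests.post(f"{BASE}/chat/completions",
                   json={"model": MODEL, "max_tokens": 64, "messages": [
                       {"role": "system", "content": "You are a helpful AI assistant."},
                       {"role": "user", "content": "give me something else than exclamation marks"},
                   ]})

print(r1.json()["choices"][0]["text"])
print(r2.json()["choices"][0]["message"]["content"])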

I could try, but FYI: Llama-3.1-Nemotron-70B-Instruct-HF-Q5_K_S.gguf works.
Same chat template, same everything; nothing changed but the path to the model.

Oh, that's very interesting then... almost like the merge borked it. I tried a smaller model; I'll try Q6_K specifically in case that one alone is messed up, then I'll merge it and try again to see if it's vLLM.

Finally tried it out with the merge, and it still produces perfectly coherent output, so I guess it's not Q6_K and not the merge process :(

Actually, can you give the sha256sum of your merge? Mine is b9ca98d65c7ae0717bbe9e93b9408b0b4d64a856046d5c828fa6237e3d12cd6e

Hi @bartowski ,
thanks for your efforts!
Mine is a4439e189c6c4107792c8dec0713c8c7f90df00fe227dba6ce4b9dfdb0d9b035
It seems that, although the merge went smoothly, my file is corrupted.
I wonder how this could happen.
But anyway, thank you very much!

Wanna try merging it again? Are you able to try with llama.cpp?

I don't want to, but I'll do it anyway :D
The download takes me some hours; I'll keep you updated tomorrow.
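
For reference, a quick way to sanity-check the merged file outside vLLM is to load it with llama-cpp-python. A minimal sketch, assuming that package is installed and the merged GGUF sits in the working directory:

# Sketch: load the merged GGUF with llama-cpp-python and run one chat turn.
# Model path and context size are assumptions; adjust to your setup.
from llama_cpp import Llama

llm = Llama(model_path="Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K.gguf", n_ctx=4096)
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "you are helpful ai assistant"},
        {"role": "user", "content": "how many r in strawberry?"},
    ],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])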

Maybe check the SHAs of the pre-merge files as well?

7e6bb209e12eadcc680e2339aebd0ad7784b92508fdf5a66bda6d15aa3a2be2e *Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K-00001-of-00002.gguf

Okay, that part looks proper... part 2?

1e42c3eef10e0b51579774abb91f45f20255fbccbfe55494d466b691e59ba402 *Llama-3.1-Nemotron-70B-Instruct-HF-Q6_K-00002-of-00002.gguf

Also proper :') Make sure you don't delete them after merging, in case it somehow goes wrong again.
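
For completeness, a small Python equivalent of sha256sum for checking the split parts and the merged file without shelling out (a sketch; pass whatever paths your local copies use):

# Sketch: chunked SHA-256 so multi-GB GGUF files aren't read into memory at once.
import hashlib
import sys

def sha256sum(path, chunk=1024 * 1024):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

if __name__ == "__main__":
    for path in sys.argv[1:]:
        print(sha256sum(path), path)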

Thank you @bartowski !

Problem solved: my merged file was corrupted.

paolovic changed discussion status to closed
