Model not functioning as expected

#2
by gitkaz - opened

Hello, thank you for converting this model.

However, it does not work correctly in my environment. The model loads without issue, but the generate function outputs meaningless strings. Below are my installed mlx_lm version (0.18.1) and the output of mlx_lm.generate.

Does it work correctly in your environment?

% pip show mlx_lm                 
Name: mlx-lm
Version: 0.18.1
Summary: LLMs on Apple silicon with MLX and the Hugging Face Hub
Home-page: https://github.com/ml-explore/mlx-examples
Author: MLX Contributors
Author-email: mlx@group.apple.com
License: MIT
Location: /Volumes/CT4000P3PSSD8JP/github/mlx_gguf_server/.venv/lib/python3.12/site-packages
Requires: jinja2, mlx, numpy, protobuf, pyyaml, transformers
Required-by: 

% python -m mlx_lm.generate --model models/mlx-community/c4ai-command-r-plus-08-2024-4bit  --prompt "hello"
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
==========
Prompt: <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>hello<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
 Smir thru ope hitch Tomo seventy seventy seventy seventy seventy seventy seventy etnic Smir thru relanç勾勾 meis stump ope¹ relanç thru gib gib gib thru seventy seventy meis rééd扭 relanç扭모리 Smir Smir gib Laff Laff Laff Laff rééd etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic etnic
==========
Prompt: 8 tokens, 6.411 tokens-per-sec
Generation: 100 tokens, 10.814 tokens-per-sec
Peak memory: 54.463 GB

Thank you,

MLX Community org

Thank you for reaching out about the model conversion issue. I'm experiencing a similar situation in my environment as well. The model loads without problems, but the generate function produces meaningless output, just as you described.
Unfortunately, I haven't been able to identify the root cause of this problem yet. I'm currently working on potential fixes, but so far without success.
If anyone else has encountered this issue and found a solution, or has any additional insights, please share them with the community. Any information that could help us resolve this would be greatly appreciated.
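
To make reports from different environments easier to compare, a rough check like the one below could flag this kind of output. It is only a sketch: the function names `repetition_ratio` and `looks_degenerate` and the 0.3 threshold are assumptions chosen for illustration, not part of mlx_lm, and the heuristic only catches output dominated by one repeated word.

```python
from collections import Counter


def repetition_ratio(text: str) -> float:
    """Return the share of whitespace-separated words taken by the most common word."""
    words = text.split()
    if not words:
        return 0.0
    # Count of the single most frequent word in the text.
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words)


def looks_degenerate(text: str, threshold: float = 0.3) -> bool:
    """Heuristic (assumed threshold): True if one word exceeds `threshold` of all words."""
    return repetition_ratio(text) > threshold
```

Pasting the generated text from mlx_lm.generate into `looks_degenerate` and reporting the result alongside the mlx_lm version would give a simple yes/no signal per environment. A low ratio does not prove the output is sensible, so this complements reading the output rather than replacing it.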
