problems when using mistral-chat
#24 by jinglishi0206
HOW TO REPRODUCE
- follow the steps to download & install mistral-inference
- download the models (see the sketch after these steps)
- RUN:
  `mistral-chat $HOME/mistral_models/7B-Instruct-v0.3 --instruct --max_tokens 256`
Output of the `mistral-chat` command:
```
`decoderF` is not supported because:
    xFormers wasn't build with CUDA support
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalLocalAttentionMask'>
    operator wasn't built - see `python -m xformers.info` for more info
`flshattF@0.0.0` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    xFormers wasn't build with CUDA support
    dtype=torch.bfloat16 (supported: {torch.float32})
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalLocalAttentionMask'>
    operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 128
```
On the package page https://github.com/mistralai/mistral-inference, it says you need a GPU to install it, because it also installs the 'xformers' package, which needs a GPU, probably one with CUDA cores.