AttributeError: 'FalconMambaCausalLMOutput' object has no attribute 'past_key_values'

#12
by surak - opened

Running on FastChat, I get this:

AttributeError: 'FalconMambaCausalLMOutput' object has no attribute 'past_key_values'

Technology Innovation Institute org

Hi @surak
Thanks for the issue!
This is because FastChat does not seem to have added support for FalconMamba models yet. Feel free to raise an issue directly there: https://github.com/lm-sys/FastChat so that they can add support for the FalconMamba architecture.
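In the meantime, the model runs fine through transformers directly. For context, the error comes from serving code that reads outputs.past_key_values: Mamba-style models return a state-space cache (cache_params) rather than attention key/value pairs, so that attribute does not exist. A minimal sketch, assuming the tiiuae/falcon-mamba-7b checkpoint (swap in the model id you are actually serving):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"  # assumed checkpoint; adjust as needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Generate without ever touching past_key_values; transformers manages
# the Mamba state-space cache internally.
inputs = tokenizer("Falcon Mamba is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))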

Thanks! The same is true with the latest vLLM:

ERROR 09-26 12:12:14 worker_base.py:464] ValueError: Model architectures ['FalconMambaForCausalLM'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeciLMForCausalLM', 'DeepseekForCausalLM', 'DeepseekV2ForCausalLM', 'ExaoneForCausalLM', 'FalconForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'InternLM2ForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'MiniCPMForCausalLM', 'MiniCPM3ForCausalLM', 'NemotronForCausalLM', 'OlmoForCausalLM', 'OlmoeForCausalLM', 'OPTForCausalLM', 'OrionForCausalLM', 'PersimmonForCausalLM', 'PhiForCausalLM', 'Phi3ForCausalLM', 'PhiMoEForCausalLM', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'Qwen2VLForConditionalGeneration', 'RWForCausalLM', 'StableLMEpochForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'SolarForCausalLM', 'ArcticForCausalLM', 'XverseForCausalLM', 'Phi3SmallForCausalLM', 'MedusaModel', 'EAGLEModel', 'MLPSpeculatorPreTrainedModel', 'JambaForCausalLM', 'GraniteForCausalLM', 'MistralModel', 'Blip2ForConditionalGeneration', 'ChameleonForConditionalGeneration', 'FuyuForCausalLM', 'InternVLChatModel', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'LlavaNextVideoForConditionalGeneration', 'LlavaOnevisionForConditionalGeneration', 'MiniCPMV', 'PaliGemmaForConditionalGeneration', 'Phi3VForCausalLM', 'PixtralForConditionalGeneration', 'QWenLMHeadModel', 'UltravoxModel', 'MllamaForConditionalGeneration', 'BartModel', 'BartForConditionalGeneration'] [repeated 2x across cluster]

Which inference engine do you recommend?

Technology Innovation Institute org

Hi @surak
There is an ongoing issue about adding Falcon Mamba support to vLLM: https://github.com/vllm-project/vllm/issues/7478. I will post a message there to check the current status. As for other inference engines, we do support llama.cpp.
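For the llama.cpp route, a minimal sketch via the llama-cpp-python bindings, assuming a recent build that includes Falcon Mamba support and a local GGUF conversion of the model (the filename below is illustrative):

from llama_cpp import Llama

# Load a GGUF conversion of the model; requires a llama.cpp build
# that includes Falcon Mamba support.
llm = Llama(model_path="falcon-mamba-7b.Q4_K_M.gguf", n_ctx=4096)

out = llm("Falcon Mamba is", max_tokens=64)
print(out["choices"][0]["text"])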

Looks like the same error with TGI too.

2024-10-08T09:22:10.199681Z ERROR warmup{max_input_length=4095 max_prefill_tokens=4145 max_total_tokens=4096 max_batch_size=None}:warmup: text_generation_router_v3::client: backends/v3/src/client/mod.rs:54: Server error: 'FalconMambaCausalLMOutput' object has no attribute 'past_key_values'
Error: Backend(Warmup(Generation("'FalconMambaCausalLMOutput' object has no attribute 'past_key_values'")))
Technology Innovation Institute org
edited Oct 13

Falcon Mamba support is on its way to vLLM thanks to @DhiyaEddine :) https://github.com/vllm-project/vllm/pull/9325

Technology Innovation Institute org

Falcon Mamba is now supported in vLLM. For now, you need to install vLLM from source in order to use it with Falcon Mamba.
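Once built from source, offline inference follows the usual vLLM pattern; a minimal sketch, assuming the tiiuae/falcon-mamba-7b checkpoint:

from vllm import LLM, SamplingParams

# Requires a source build of vLLM that includes the Falcon Mamba PR above.
llm = LLM(model="tiiuae/falcon-mamba-7b")
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Falcon Mamba is"], params)
print(outputs[0].outputs[0].text)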
