Issues while deploying on AWS SageMaker with TGI
I've been trying to deploy codellama/CodeLlama-13b-Instruct-hf
on AWS SageMaker with the TGI container for a while now. I am facing two issues in particular:
- The tokenizer class mismatch:
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'CodeLlamaTokenizer'.
The class this function is called from is 'LlamaTokenizer'.
- Model loading error with TGI:
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 142, in serve_inner model = get_model( File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 185, in get_model return FlashLlama( File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/flash_llama.py", line 65, in __init__ model = FlashLlamaForCausalLM(config, weights) File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 452, in __init__ self.model = FlashLlamaModel(config, weights) File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 390, in __init__ [ File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 391, in <listcomp> FlashLlamaLayer( File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 326, in __init__ self.self_attn = FlashLlamaAttention( File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 183, in __init__ self.rotary_emb = PositionRotaryEmbedding.load( File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/layers.py", line 395, in load inv_freq = weights.get_tensor(f"{prefix}.inv_freq") File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/weights.py", line 62, in get_tensor filename, tensor_name = self.get_filename(tensor_name) File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/weights.py", line 49, in get_filename raise RuntimeError(f"weight {tensor_name} does not exist")
RuntimeError: weight model.layers.0.self_attn.rotary_emb.inv_freq does not exist
Any ideas on how these can be resolved?
I have tried using the latest transformers version, 4.33.1, as well.
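For reference, this is roughly how I'm deploying (a minimal sketch; the image version, instance type, and env values are from my setup and may need adjusting):

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# TGI (LLM) container image; the version here is an assumption,
# use whatever is available in your region
image_uri = get_huggingface_llm_image_uri("huggingface", version="0.9.3")

model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={
        "HF_MODEL_ID": "codellama/CodeLlama-13b-Instruct-hf",
        "SM_NUM_GPUS": "4",          # tensor-parallel degree
        "MAX_INPUT_LENGTH": "4096",
        "MAX_TOTAL_TOKENS": "8192",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    container_startup_health_check_timeout=600,
)
```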
cc @philschmid
Same issue here... Any help would be greatly appreciated.
I already tried pip installing different transformers versions, but none of them fixed the problem.
!pip install git+https://github.com/huggingface/transformers.git@main
!pip install git+https://github.com/ArthurZucker/transformers.git@main
!pip install git+https://github.com/ArthurZucker/transformers.git@add-llama-code
You should only need pip install git+https://github.com/huggingface/transformers.git@main;
my branch was just for development.
This warning is safe to ignore.
Both tokenizers are the same (for TGI purposes), as TGI doesn't use CodeLlama's code-infilling capabilities; you would need to send the pre-prompt yourself.
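For example, a rough sketch of sending the pre-prompt yourself from the SageMaker side (the [INST]/<<SYS>> template is the Llama-2 chat format that CodeLlama-Instruct uses; the system message and generation parameters are just illustrative, and `predictor` is the endpoint from the deployment snippet above):

```python
# Build the instruct pre-prompt yourself (Llama-2 chat style,
# which CodeLlama-Instruct was trained with)
prompt = (
    "<s>[INST] <<SYS>>\nYou are a helpful coding assistant.\n<</SYS>>\n\n"
    "Write a Python function that reverses a string. [/INST]"
)

payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 256, "temperature": 0.2},
}

response = predictor.predict(payload)
print(response[0]["generated_text"])
```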
As for the missing inv_freq:
CodeLlama's weights didn't include those tensors (architecturally it's essentially Llama v2), and old TGI versions expected inv_freq to be present.
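For the curious: inv_freq is a pure function of the head dimension and the RoPE base, which is why newer TGI can recompute it from the config instead of loading it as a checkpoint weight. A sketch of the standard formula (not TGI's exact code; base is rope_theta, 1e6 for CodeLlama versus 1e4 for Llama 2):

```python
import torch

def rotary_inv_freq(dim: int, base: float = 1_000_000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies: one per pair of hidden dims.
    # base comes from rope_theta in the model config.
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))

print(rotary_inv_freq(128)[:4])  # head_dim = 128 for the 13B model
```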
This should all be solved with the upcoming SageMaker release of the latest TGI.
Soon I hope, but I can't make any promises (it's not in our hands at this point)
1.0.3 is now available on SageMaker.
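If you pin the image version explicitly, something like this should pick it up (a sketch, using the same SDK helper as in the deployment snippet above):

```python
from sagemaker.huggingface import get_huggingface_llm_image_uri

# Request the TGI 1.0.3 image, which includes the inv_freq fix
image_uri = get_huggingface_llm_image_uri("huggingface", version="1.0.3")
print(image_uri)
```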