google/gemma-2b · How to set `tokenizer.chat_template` to an appropriate template using Gemma-2b

Mar 7

Anyone else having problems defining the chat template for the tokenizer?

I've been trying for 2 days, and every attempt produces the following warning:

"No chat template is defined for this tokenizer - using a default chat template that implements the ChatML format (without BOS/EOS tokens!). If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information."

Code used:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/gemma-2b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    token="",
)

chat = [{"role": "user", "content": "Write a hello world program"}]

# Render the conversation to a prompt string; this call emits the message quoted above
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)

# Decode the generated token IDs back to text instead of printing the raw tensor
print(tokenizer.decode(outputs[0]))

anihm136

Google org Mar 7

I believe the chat template only applies to the instruction-tuned versions, since those were trained with a specific template. The base models were not trained with any particular template, so none is defined for their tokenizer.
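One way to catch this early, rather than relying on the warning, is to check `tokenizer.chat_template` before rendering the prompt. The following is a minimal sketch: `render_prompt` is a hypothetical helper name, not part of transformers, and it assumes only that the tokenizer exposes the `chat_template` attribute and the `apply_chat_template` method used in the code above.

```python
def render_prompt(tokenizer, chat):
    """Render `chat` with the tokenizer's own chat template.

    Hypothetical helper (not a transformers API): it refuses to continue
    when the tokenizer has no template of its own, so a generic default
    format is never applied silently.
    """
    # `chat_template` is None when the checkpoint does not ship a template.
    if getattr(tokenizer, "chat_template", None) is None:
        raise ValueError(
            "This tokenizer defines no chat template; it is probably a base "
            "model. Use an instruction-tuned checkpoint or set "
            "tokenizer.chat_template explicitly."
        )
    # Same call as in the original post, returning the prompt as a string.
    return tokenizer.apply_chat_template(
        chat, tokenize=False, add_generation_prompt=True
    )
```

Given the warning quoted above, this would be expected to raise for a tokenizer loaded from `google/gemma-2b` and to return the prompt string for an instruction-tuned checkpoint.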

Wigenes

Mar 8

@anihm136 Thanks for the help. I'll try it and report the result here. Kudos

Wigenes

Mar 11

@anihm136 You were correct. I used the google/gemma-2b-it model and it worked perfectly. Hugs
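For anyone curious what the instruction-tuned template actually produces, here is a rough sketch of the turn layout in plain Python. It assumes the format published for the Gemma instruction-tuned checkpoints (`<start_of_turn>` / `<end_of_turn>` markers, with the assistant role named `model`). `format_gemma_chat` is a hypothetical name used only for illustration; in practice `tokenizer.apply_chat_template` on `google/gemma-2b-it` should be the source of truth.

```python
def format_gemma_chat(messages, add_generation_prompt=True, bos_token="<bos>"):
    """Illustrative re-implementation of the Gemma instruction-tuned turn format.

    Hypothetical helper for illustration only; it skips the validation the
    real template performs (for example, alternating user/assistant roles).
    """
    parts = [bos_token]
    for message in messages:
        # The template names the assistant's turn "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open a model turn so generation continues as the assistant.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


chat = [{"role": "user", "content": "Write a hello world program"}]
prompt = format_gemma_chat(chat)
print(repr(prompt))
# '<bos><start_of_turn>user\nWrite a hello world program<end_of_turn>\n<start_of_turn>model\n'
```

Because the rendered prompt already starts with the BOS token, encoding it with `add_special_tokens=False`, as in the original code, avoids adding a second one.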

Wigenes changed discussion status to closed Mar 11