Is the chat template correct?

#20
by valstu

When I read the documentation at https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2#-tool-calling-(1b/3b)- I noticed that the chat template differs from the one used in this and the other Llama 3.2 1B/3B Instruct repositories. The tool calling part in particular is different: in Llama 3.1 the chat template uses JSON-based tool calling, but in the Llama 3.2 documentation tool calling seems to use a function-style call signature (although the 3.2 vision models still use the old syntax).

And here is the example from the Llama 3.2 1B/3B model card:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Questions: Can you retrieve the details for the user with the ID 7890, who has black as their special request?
Here is a list of functions in JSON format that you can invoke:
[
    {
        "name": "get_user_info",
        "description": "Retrieve details for a specific user by their unique identifier. Note that the provided function is in Python 3 syntax.",
        "parameters": {
            "type": "dict",
            "required": [
                "user_id"
            ],
            "properties": {
                "user_id": {
                "type": "integer",
                "description": "The unique identifier of the user. It is used to fetch the specific user details from the database."
            },
            "special": {
                "type": "string",
                "description": "Any special information or parameters that need to be considered while fetching user details.",
                "default": "none"
                }
            }
        }
    }
]
Should you decide to return the function call(s), Put it in the format of [func1(params_name=params_value, params_name2=params_value2...), func2(params)]
NO other text MUST be included.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
[get_user_info(user_id=7890, special='black')]<|eot_id|>

Here's an example from the Llama 3.1 model card:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.<|eot_id|><|start_header_id|>user<|end_header_id|>

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{
    "type": "function",
    "function": {
    "name": "get_current_conditions",
    "description": "Get the current weather conditions for a specific location",
    "parameters": {
        "type": "object",
        "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g., San Francisco, CA"
        },
        "unit": {
            "type": "string",
            "enum": ["Celsius", "Fahrenheit"],
            "description": "The temperature unit to use. Infer this from the user's location."
        }
        },
        "required": ["location", "unit"]
    }
    }
}

Question: what is the weather like in San Francisco?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{"name": "get_current_conditions", "parameters": {"location": "San Francisco, CA", "unit": "Fahrenheit"}}<|eot_id|><|start_header_id|>ipython<|end_header_id|>

{"output": "Clouds giving way to sun Hi: 76° Tonight: Mainly clear early, then areas of low clouds forming Lo: 56°"}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The weather in San Francisco is currently cloudy with a high of 76° and a low of 56°, with clear skies expected tonight.<|eot_id|>
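
For what it's worth, one way to check which of these two styles a given repo actually uses is to render its chat template directly. Below is a minimal sketch, assuming a transformers version whose apply_chat_template accepts a tools argument; the model id and tool schema are only illustrative:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")  # illustrative model id

# Tool definition in the generic JSON-schema format accepted by apply_chat_template
tools = [{
    "type": "function",
    "function": {
        "name": "get_user_info",
        "description": "Retrieve details for a specific user by their unique identifier.",
        "parameters": {
            "type": "object",
            "properties": {
                "user_id": {"type": "integer", "description": "The unique identifier of the user."},
                "special": {"type": "string", "description": "Any special request to consider."},
            },
            "required": ["user_id"],
        },
    },
}]

messages = [{"role": "user", "content": "Can you retrieve the details for the user with the ID 7890, who has black as their special request?"}]

# Render the prompt as text and inspect which tool-calling instructions the template injects
prompt = tok.apply_chat_template(messages, tools=tools, add_generation_prompt=True, tokenize=False)
print(prompt)

Printing the rendered prompt shows exactly what tool-calling instructions the repo's template emits, which is easier to compare than the doc pages.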

Another thing to note is that the Llama 3.2 documentation has no examples of how to pass function call responses back to the model, but this might be because the model cannot handle multi-turn function calling.
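
For reference, the generic transformers convention for passing a tool result back is a "tool" role message, roughly as sketched below; whether the 3.2 1B/3B template actually renders this (e.g. under an ipython header as in 3.1) is part of what I'd like to confirm. Again, the model id is only illustrative:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")  # illustrative model id

messages = [
    {"role": "user", "content": "What is the weather like in San Francisco?"},
    # The assistant's tool call, in the generic format used by transformers chat templates
    {"role": "assistant", "tool_calls": [{
        "type": "function",
        "function": {
            "name": "get_current_conditions",
            "arguments": {"location": "San Francisco, CA", "unit": "Fahrenheit"},
        },
    }]},
    # The tool's output, passed back as a "tool" role message
    {"role": "tool", "name": "get_current_conditions",
     "content": '{"output": "Clouds giving way to sun Hi: 76"}'},
]

prompt = tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
print(prompt)  # in the Llama 3.1 template a tool response is rendered under an ipython header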

@pcuenq do you have an idea which template is correct?

From my search and analysis, it seems:

  1. For the Llama 3.2 1B/3B Instruct models, the recommended chat template is still the same as the one used in Llama 3.1, which includes JSON-based tool calling.
    The function-style tool calling described in the Llama 3.2 documentation seems to apply to specific models or use cases and may not be supported by the Llama 3.2 1B/3B Instruct models.

  2. The Llama 3.2 1B/3B Instruct models are lightweight and may have limitations compared to larger models like Llama 3.1's 8B/70B/405B.
    Multi-turn function calling and advanced tool use may not be fully supported in the 1B/3B models, as indicated by the documentation and community discussions.

Aligning templates with model training: since the model card and repositories for the Llama 3.2 1B/3B Instruct models use JSON-based tool calling in their examples, it's advisable to use that template for your instruction fine-tuning.
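
For example, a single fine-tuning record following that JSON-based style could look roughly like this (purely illustrative; the "messages" layout is just a common convention, not an official spec):

# Illustrative JSON-style fine-tuning record: the assistant target is the JSON tool call itself
record = {
    "messages": [
        {"role": "user", "content": "Can you retrieve the details for the user with the ID 7890, who has black as their special request?"},
        {"role": "assistant", "content": '{"name": "get_user_info", "parameters": {"user_id": 7890, "special": "black"}}'},
    ]
}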

@pcuenq Am I correct in my assessment?

@olabs-ai seems like the templates might have changed slightly on llama.com since I originally posted this. But my concern remains: should we expect the 3.2 Instruct models to answer with function-style tool calls, e.g.
[get_user_info(user_id=7890, special='black')]
or JSON-style tool calls, e.g.
{"name": "get_user_info", "parameters": {"user_id": 7890, "special": "black"}}
This affects how I should construct my fine-tuning data.
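
In case it helps, here is the naive check I use to see which of the two styles a generation actually came back in (just a sketch; the regex only covers the simple single-call case):

import json
import re

def classify_tool_call(output: str):
    """Rough check for which of the two candidate formats a model output uses."""
    text = output.strip()
    # JSON style: {"name": ..., "parameters": {...}}
    try:
        call = json.loads(text)
        if isinstance(call, dict) and "name" in call:
            return "json", call["name"], call.get("parameters", {})
    except json.JSONDecodeError:
        pass
    # Function style: [func_name(arg=value, ...)]
    m = re.match(r"\[(\w+)\((.*)\)\]$", text)
    if m:
        return "function", m.group(1), m.group(2)
    return "unknown", None, None

print(classify_tool_call('{"name": "get_user_info", "parameters": {"user_id": 7890, "special": "black"}}'))
print(classify_tool_call("[get_user_info(user_id=7890, special='black')]"))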

Meta Llama org

I had a few questions after running my own tests. TL;DR: tool calling works, but it's very sensitive to the chat template, which may require some changes we are in the process of testing.

Also cc @ariG23498 who's working on it as well.
