File size: 17,035 Bytes
54b2cf8 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 |
---
extra_gated_heading: >-
Acknowledge to follow corresponding license to access the
repository
extra_gated_button_content: Agree and access repository
extra_gated_fields:
First Name: text
Last Name: text
Country: country
Affiliation: text
license: cc-by-nc-4.0
datasets:
- Salesforce/xlam-function-calling-60k
language:
- en
pipeline_tag: text-generation
tags:
- function-calling
- LLM Agent
- tool-use
- mistral
- pytorch
---
![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)
# QuantFactory/xLAM-7b-r-GGUF
This is quantized version of [Salesforce/xLAM-7b-r](https://huggingface.co/Salesforce/xLAM-7b-r) created using llama.cpp
# Original Model Card
<p align="center">
<img width="500px" alt="xLAM" src="https://huggingface.co/datasets/jianguozhang/logos/resolve/main/xlam-no-background.png">
</p>
<p align="center">
<a href="">[Homepage]</a> |
<a href="">[Paper]</a> |
<a href="https://github.com/SalesforceAIResearch/xLAM">[Github]</a>
</p>
<hr>
Welcome to the xLAM model family! [Large Action Models (LAMs)](https://blog.salesforceairesearch.com/large-action-models/) are advanced large language models designed to enhance decision-making and translate user intentions into executable actions that interact with the world. LAMs autonomously plan and execute tasks to achieve specific goals, serving as the brains of AI agents. They have the potential to automate workflow processes across various domains, making them invaluable for a wide range of applications.
**The model release is exclusively for research purposes. A new and enhanced version of xLAM will soon be available exclusively to customers on our Platform.**
## Table of Contents
- [Model Series](#model-series)
- [Repository Overview](#repository-overview)
- [Benchmark Results](#benchmark-results)
- [Usage](#usage)
- [Basic Usage with Huggingface](#basic-usage-with-huggingface)
- [License](#license)
- [Citation](#citation)
## Model Series
We provide a series of xLAMs in different sizes to cater to various applications, including those optimized for function-calling and general agent applications:
| Model | # Total Params | Context Length | Download Model | Download GGUF files |
|------------------------|----------------|----------------|----------------|----------|
| xLAM-1b-fc-r | 1.35B | 16k | [π€ Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r) | [π€ Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r-gguf) |
| xLAM-7b-fc-r | 6.91B | 4k | [π€ Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r) | [π€ Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r-gguf) |
| xLAM-7b-r | 7.24B | 32k | [π€ Link](https://huggingface.co/Salesforce/xLAM-7b-r) | -- |
| xLAM-8x7b-r | 46.7B | 32k | [π€ Link](https://huggingface.co/Salesforce/xLAM-8x7b-r) | -- |
| xLAM-8x22b-r | 141B | 64k | [π€ Link](https://huggingface.co/Salesforce/xLAM-8x22b-r) | -- |
For our Function-calling series (more details are included at [here](https://huggingface.co/Salesforce/xLAM-7b-fc-r)), we also provide their quantized [GGUF](https://huggingface.co/docs/hub/en/gguf) files for efficient deployment and execution. GGUF is a file format designed to efficiently store and load large language models, making GGUF ideal for running AI models on local devices with limited resources, enabling offline functionality and enhanced privacy.
For more details, check our [GitHub](https://github.com/SalesforceAIResearch/xLAM) and [paper]().
## Repository Overview
This repository is about the general tool use series. For more specialized function calling models, please take a look into our `fc` series [here](https://huggingface.co/Salesforce/xLAM-7b-fc-r).
The instructions will guide you through the setup, usage, and integration of our model series with HuggingFace.
### Framework Versions
- Transformers 4.41.0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
## Usage
### Basic Usage with Huggingface
To use the model from Huggingface, please first install the `transformers` library:
```bash
pip install transformers>=4.41.0
```
Please note that, our model works best with our provided prompt format.
It allows us to extract JSON output that is similar to the [function-calling mode of ChatGPT](https://platform.openai.com/docs/guides/function-calling).
We use the following example to illustrate how to use our model for 1) single-turn use case, and 2) multi-turn use case
#### 1. Single-turn use case
````python
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
torch.random.manual_seed(0)
model_name = "Salesforce/xLAM-7b-r"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Please use our provided instruction prompt for best performance
task_instruction = """
Based on the previous context and API request history, generate an API request or a response as an AI assistant.""".strip()
format_instruction = """
The output should be of the JSON format, which specifies a list of generated function calls. The example format is as follows, please make sure the parameter type is correct. If no function call is needed, please make
tool_calls an empty list "[]".
```
{"thought": "the thought process, or an empty string", "tool_calls": [{"name": "api_name1", "arguments": {"argument1": "value1", "argument2": "value2"}}]}
```
""".strip()
# Define the input query and available tools
query = "What's the weather like in New York in fahrenheit?"
get_weather_api = {
"name": "get_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, New York"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The unit of temperature to return"
}
},
"required": ["location"]
}
}
search_api = {
"name": "search",
"description": "Search for information on the internet",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query, e.g. 'latest news on AI'"
}
},
"required": ["query"]
}
}
openai_format_tools = [get_weather_api, search_api]
# Helper function to convert openai format tools to our more concise xLAM format
def convert_to_xlam_tool(tools):
''''''
if isinstance(tools, dict):
return {
"name": tools["name"],
"description": tools["description"],
"parameters": {k: v for k, v in tools["parameters"].get("properties", {}).items()}
}
elif isinstance(tools, list):
return [convert_to_xlam_tool(tool) for tool in tools]
else:
return tools
def build_conversation_history_prompt(conversation_history: str):
parsed_history = []
for step_data in conversation_history:
parsed_history.append({
"step_id": step_data["step_id"],
"thought": step_data["thought"],
"tool_calls": step_data["tool_calls"],
"next_observation": step_data["next_observation"],
"user_input": step_data['user_input']
})
history_string = json.dumps(parsed_history)
return f"\n[BEGIN OF HISTORY STEPS]\n{history_string}\n[END OF HISTORY STEPS]\n"
# Helper function to build the input prompt for our model
def build_prompt(task_instruction: str, format_instruction: str, tools: list, query: str, conversation_history: list):
prompt = f"[BEGIN OF TASK INSTRUCTION]\n{task_instruction}\n[END OF TASK INSTRUCTION]\n\n"
prompt += f"[BEGIN OF AVAILABLE TOOLS]\n{json.dumps(xlam_format_tools)}\n[END OF AVAILABLE TOOLS]\n\n"
prompt += f"[BEGIN OF FORMAT INSTRUCTION]\n{format_instruction}\n[END OF FORMAT INSTRUCTION]\n\n"
prompt += f"[BEGIN OF QUERY]\n{query}\n[END OF QUERY]\n\n"
if len(conversation_history) > 0: prompt += build_conversation_history_prompt(conversation_history)
return prompt
# Build the input and start the inference
xlam_format_tools = convert_to_xlam_tool(openai_format_tools)
conversation_history = []
content = build_prompt(task_instruction, format_instruction, xlam_format_tools, query, conversation_history)
messages=[
{ 'role': 'user', 'content': content}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
agent_action = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
````
Then you should be able to see the following output string in JSON format:
```shell
{"thought": "I need to get the current weather for New York in fahrenheit.", "tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}
```
#### 2. Multi-turn use case
We also support multi-turn interaction with our model series. Here is the example of next round of interaction from the above example:
````python
def parse_agent_action(agent_action: str):
"""
Given an agent's action, parse it to add to conversation history
"""
try: parsed_agent_action_json = json.loads(agent_action)
except: return "", []
if "thought" not in parsed_agent_action_json.keys(): thought = ""
else: thought = parsed_agent_action_json["thought"]
if "tool_calls" not in parsed_agent_action_json.keys(): tool_calls = []
else: tool_calls = parsed_agent_action_json["tool_calls"]
return thought, tool_calls
def update_conversation_history(conversation_history: list, agent_action: str, environment_response: str, user_input: str):
"""
Update the conversation history list based on the new agent_action, environment_response, and/or user_input
"""
thought, tool_calls = parse_agent_action(agent_action)
new_step_data = {
"step_id": len(conversation_history) + 1,
"thought": thought,
"tool_calls": tool_calls,
"step_id": len(conversation_history),
"next_observation": environment_response,
"user_input": user_input,
}
conversation_history.append(new_step_data)
def get_environment_response(agent_action: str):
"""
Get the environment response for the agent_action
"""
# TODO: add custom implementation here
error_message, response_message = "", ""
return {"error": error_message, "response": response_message}
# ------------- before here are the steps to get agent_response from the example above ----------
# 1. get the next state after agent's response:
# The next 2 lines are examples of getting environment response and user_input.
# It is depended on particular usage, we can have either one or both of those.
environment_response = get_environment_response(agent_action)
user_input = "Now, search on the Internet for cute puppies"
# 2. after we got environment_response and (or) user_input, we want to add to our conversation history
update_conversation_history(conversation_history, agent_action, environment_response, user_input)
# 3. we now can build the prompt
content = build_prompt(task_instruction, format_instruction, xlam_format_tools, query, conversation_history)
# 4. Now, we just retrieve the inputs for the LLM
messages=[
{ 'role': 'user', 'content': content}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# 5. Generate the outputs & decode
# tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
agent_action = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
````
This would be the corresponding output:
```shell
{"thought": "I need to get the current weather for New York in fahrenheit.", "tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}
```
We highly recommend to use our provided prompt format and helper functions to yield the best function-calling performance of our model.
#### Example multi-turn prompt and output
Prompt:
````json
[BEGIN OF TASK INSTRUCTION]
Based on the previous context and API request history, generate an API request or a response as an AI assistant.
[END OF TASK INSTRUCTION]
[BEGIN OF AVAILABLE TOOLS]
[
{
"name": "get_fire_info",
"description": "Query the latest wildfire information",
"parameters": {
"location": {
"type": "string",
"description": "Location of the wildfire, for example: 'California'",
"required": true,
"format": "free"
},
"radius": {
"type": "number",
"description": "The radius (in miles) around the location where the wildfire is occurring, for example: 10",
"required": false,
"format": "free"
}
}
},
{
"name": "get_hurricane_info",
"description": "Query the latest hurricane information",
"parameters": {
"name": {
"type": "string",
"description": "Name of the hurricane, for example: 'Irma'",
"required": true,
"format": "free"
}
}
},
{
"name": "get_earthquake_info",
"description": "Query the latest earthquake information",
"parameters": {
"magnitude": {
"type": "number",
"description": "The minimum magnitude of the earthquake that needs to be queried.",
"required": false,
"format": "free"
},
"location": {
"type": "string",
"description": "Location of the earthquake, for example: 'California'",
"required": false,
"format": "free"
}
}
}
]
[END OF AVAILABLE TOOLS]
[BEGIN OF FORMAT INSTRUCTION]
Your output should be in the JSON format, which specifies a list of function calls. The example format is as follows. Please make sure the parameter type is correct. If no function call is needed, please make tool_calls an empty list '[]'.
```{"thought": "the thought process, or an empty string", "tool_calls": [{"name": "api_name1", "arguments": {"argument1": "value1", "argument2": "value2"}}]}```
[END OF FORMAT INSTRUCTION]
[BEGIN OF QUERY]
User: Can you give me the latest information on the wildfires occurring in California?
[END OF QUERY]
[BEGIN OF HISTORY STEPS]
[
{
"thought": "Sure, what is the radius (in miles) around the location of the wildfire?",
"tool_calls": [],
"step_id": 1,
"next_observation": "",
"user_input": "User: Let me think... 50 miles."
},
{
"thought": "",
"tool_calls": [
{
"name": "get_fire_info",
"arguments": {
"location": "California",
"radius": 50
}
}
],
"step_id": 2,
"next_observation": [
{
"location": "Los Angeles",
"acres_burned": 1500,
"status": "contained"
},
{
"location": "San Diego",
"acres_burned": 12000,
"status": "active"
}
]
},
{
"thought": "Based on the latest information, there are wildfires in Los Angeles and San Diego. The wildfire in Los Angeles has burned 1,500 acres and is contained, while the wildfire in San Diego has burned 12,000 acres and is still active.",
"tool_calls": [],
"step_id": 3,
"next_observation": "",
"user_input": "User: Can you tell me about the latest earthquake?"
}
]
[END OF HISTORY STEPS]
````
Output:
````json
{"thought": "", "tool_calls": [{"name": "get_earthquake_info", "arguments": {"location": "California"}}]}
````
## License
The model is distributed under the CC-BY-NC-4.0 license.
<!-- ## Citation
If you find this repo helpful, please cite our paper:
```bibtex
``` -->
|