legolasyiu committed
Commit
3213e45
1 Parent(s): 64e4d5d

Update README.md

Files changed (1)
  1. README.md +84 -0
README.md CHANGED
@@ -20,3 +20,87 @@ tags:
  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+
+ ---
+ ## Intended Use
+
+ **Intended Use Cases** Llama 3.1 is intended for commercial and research use in multiple languages. Instruction-tuned, text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.1 model collection also supports leveraging its models' outputs to improve other models, including for synthetic data generation and distillation. The Llama 3.1 Community License allows for these use cases.
+
+ **Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and the Llama 3.1 Community License. Use in languages beyond those explicitly referenced as supported in this model card.
+
+ **Note**: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages. Developers may fine-tune Llama 3.1 models for languages beyond the 8 supported languages provided they comply with the Llama 3.1 Community License and the Acceptable Use Policy, and in such cases are responsible for ensuring that any use of Llama 3.1 in additional languages is done in a safe and responsible manner.
+
+ ## How to use
+
+ This repository contains two versions of Meta-Llama-3.1-8B-Instruct, for use with transformers and with the original `llama` codebase.
+
+ ### Use with transformers
+
+ Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.
+
+ Make sure to update your transformers installation via `pip install --upgrade transformers`.
+
+ ```python
+ import transformers
+ import torch
+
+ model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
+
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model_id,
+     model_kwargs={"torch_dtype": torch.bfloat16},
+     device_map="auto",
+ )
+
+ messages = [
+     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
+     {"role": "user", "content": "Who are you?"},
+ ]
+
+ outputs = pipeline(
+     messages,
+     max_new_tokens=256,
+ )
+ print(outputs[0]["generated_text"][-1])
+ ```
+
+ Note: You can also find detailed recipes on how to use the model locally, with `torch.compile()`, assisted generation, quantization, and more at [`huggingface-llama-recipes`](https://github.com/huggingface/huggingface-llama-recipes).
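+
+ The `pipeline` example above covers the first path mentioned earlier; here is a minimal sketch of the second path, loading the Auto classes and calling `generate()` directly. `max_new_tokens=256` is just an illustrative setting:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ messages = [
+     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
+     {"role": "user", "content": "Who are you?"},
+ ]
+
+ # Build the prompt with the model's chat template and move it to the model's device.
+ input_ids = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ outputs = model.generate(input_ids, max_new_tokens=256)
+ # Decode only the newly generated tokens, skipping the prompt.
+ print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```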
+
+ ### Tool use with transformers
+
+ LLaMA-3.1 supports multiple tool use formats. You can see a full guide to prompt formatting [here](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/).
+
+ Tool use is also supported through [chat templates](https://huggingface.co/docs/transformers/main/chat_templating#advanced-tool-use--function-calling) in Transformers.
+ Here is a quick example showing a single simple tool:
+
+ ```python
+ from transformers import AutoTokenizer
+
+ # Load the model's tokenizer (skip this if you already have one from the examples above).
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
+
+ # First, define a tool
+ def get_current_temperature(location: str) -> float:
+     """
+     Get the current temperature at a location.
+
+     Args:
+         location: The location to get the temperature for, in the format "City, Country"
+     Returns:
+         The current temperature at the specified location in the specified units, as a float.
+     """
+     return 22.  # A real function should probably actually get the temperature!
+
+ # Next, create a chat and apply the chat template
+ messages = [
+     {"role": "system", "content": "You are a bot that responds to weather queries."},
+     {"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
+ ]
+
+ inputs = tokenizer.apply_chat_template(messages, tools=[get_current_temperature], add_generation_prompt=True)
+ ```
+
+ You can then generate text from this input as normal. If the model generates a tool call, you should add it to the chat like so:
+
+ ```python
+ tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
+ messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
+ ```
+
+ and then call the tool and append the result, with the `tool` role, like so:
+
+ ```python
+ messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
+ ```
+
+ After that, you can `generate()` again to let the model use the tool result in the chat. Note that this was a very brief introduction to tool calling; for more information,
+ see the [LLaMA prompt format docs](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/) and the Transformers [tool use documentation](https://huggingface.co/docs/transformers/main/chat_templating#advanced-tool-use--function-calling).
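+
+ To make that last step concrete, here is a minimal sketch of the follow-up generation, assuming a `model` and `tokenizer` loaded as in the earlier examples; the generation settings are illustrative:
+
+ ```python
+ # Re-apply the chat template so the model sees the appended tool result,
+ # then generate the assistant's final answer.
+ input_ids = tokenizer.apply_chat_template(
+     messages, tools=[get_current_temperature], add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ outputs = model.generate(input_ids, max_new_tokens=256)
+ print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```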