---
license: apache-2.0
datasets:
- mlabonne/guanaco-llama2-1k
pipeline_tag: text-generation
---
# 🦙🧠 Miniguanaco-13b

📝 [Article](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32) |
💻 [Colab](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing) |
📄 [Script](https://gist.github.com/mlabonne/b5718e1b229ce6553564e3f56df72c5c)

<center><img src="https://i.imgur.com/1IZmjU4.png" width="300"></center>

This is a `Llama-2-13b-chat-hf` model fine-tuned using QLoRA (4-bit precision) on the [`mlabonne/guanaco-llama2-1k`](https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k) dataset, a subset of [`timdettmers/openassistant-guanaco`](https://huggingface.co/datasets/timdettmers/openassistant-guanaco).
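
The training samples in [`mlabonne/guanaco-llama2-1k`](https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k) are already wrapped in the Llama 2 chat template. A minimal sketch to inspect them, assuming the single `text` column described on the dataset card:

``` python
# Peek at the training data (assumes the dataset exposes a `text` column)
from datasets import load_dataset

dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")
print(dataset[0]["text"])  # a conversation wrapped in <s>[INST] ... [/INST] ... </s>
```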

## 🔧 Training

It was trained on an RTX 3090 and is mainly designed for educational purposes, not for inference. Parameters:

```
max_seq_length = 2048
use_nested_quant = True
bnb_4bit_compute_dtype=bfloat16
lora_r=8
lora_alpha=16
lora_dropout=0.05
per_device_train_batch_size=2
```
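
These names map onto the `transformers`/`peft`/`trl` stack used in the linked article. A minimal training sketch under that assumption, following the article-era `SFTTrainer` API; `output_dir` and the epoch count are illustrative, not from this card:

``` python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

base_model = "meta-llama/Llama-2-13b-chat-hf"
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

# 4-bit NF4 quantization; double quantization and bfloat16 compute correspond
# to use_nested_quant=True and bnb_4bit_compute_dtype=bfloat16 above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter matching lora_r=8, lora_alpha=16, lora_dropout=0.05
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=2048,
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="./results",         # illustrative
        per_device_train_batch_size=2,  # per_device_train_batch_size=2
        num_train_epochs=1,             # illustrative
    ),
)
trainer.train()
```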

## 💻 Usage

``` python
# pip install transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/llama-2-13b-miniguanaco"
prompt = "What is a large language model?"

# Load the model with float16 weights, sharded across available devices
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the prompt in the Llama 2 chat template: <s>[INST] ... [/INST]
sequences = pipeline(
    f'<s>[INST] {prompt} [/INST]',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
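
To fit the 13B model on a smaller GPU, you can also load it in 4-bit. A minimal sketch, assuming `bitsandbytes` is installed; this is not part of the recipe above:

``` python
# pip install bitsandbytes
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the weights to 4-bit on load to roughly quarter the memory footprint
model_4bit = AutoModelForCausalLM.from_pretrained(
    "mlabonne/llama-2-13b-miniguanaco",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
```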