@@ -31,6 +31,54 @@ Finetuning was performed on the Dutch [BramVanroy/alpaca-cleaned-dutch](https://
See [DAMO-NLP-MT/polylm-13b](https://huggingface.co/DAMO-NLP-MT/polylm-13b) for all information about the base model.

## Model usage

A basic example of how to use the finetuned model:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_name = "robinsmits/polylm_13b_ft_alpaca_clean_dutch"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast = False, legacy = False)
model = AutoPeftModelForCausalLM.from_pretrained(model_name, device_map = "auto", load_in_4bit = True, torch_dtype = torch.bfloat16)

prompt = "### Instructie:\nWat zijn de drie belangrijkste softwareonderdelen die worden gebruikt bij webontwikkeling?\n\n### Antwoord:\n"

inputs = tokenizer(prompt, return_tensors = "pt")
sample = model.generate(input_ids = inputs.input_ids.cuda(),
                        attention_mask = inputs.attention_mask.cuda(),
                        max_new_tokens = 128,
                        do_sample = True,
                        top_p = 0.85,
                        top_k = 50,
                        temperature = 0.5,
                        repetition_penalty = 1.2,
                        length_penalty = -1.0,
                        num_return_sequences = 1,
                        pad_token_id = tokenizer.eos_token_id,
                        forced_eos_token_id = tokenizer.eos_token_id)
output = tokenizer.decode(sample[0], skip_special_tokens = True)

print(output.split(prompt)[1])
```
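The `### Instructie:` / `### Antwoord:` prompt format follows the Alpaca instruction template the model was finetuned on. For convenience it can be wrapped in a small helper (a sketch; `build_prompt` is a hypothetical name, not part of this repository):

```python
def build_prompt(instruction: str) -> str:
    # Alpaca-style Dutch prompt: an instruction block followed by an
    # empty answer block that the model is expected to complete.
    return f"### Instructie:\n{instruction}\n\n### Antwoord:\n"

prompt = build_prompt("Wat zijn de drie belangrijkste softwareonderdelen die worden gebruikt bij webontwikkeling?")
```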

The prompt and the generated output for the example above will look similar to the following:

```
### Instructie:
Wat zijn de drie belangrijkste softwareonderdelen die worden gebruikt bij webontwikkeling?

### Antwoord:

De drie belangrijkste softwareonderdelen die worden gebruikt bij webontwikkeling, zijn HTML (HyperText Markup Language), CSS (Cascading Style Sheets) en JavaScript. Deze onderdelen stellen gebruikers in staat om inhoud op een website te creëren of aanpassen met behulp van codering. Bovendien kunnen ze interactieve elementen zoals animatie, video's en audio-opnames toevoegen aan websites. HTML is het meest voorkomende onderdeel omdat deze de basis vormt voor alle andere componenten. Het stelt ontwikkelaars in staat om tekst en afbeeldingen op hun pagina's weer te geven door gebruik te maken van markup tags
```
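Splitting on the exact prompt string, as `output.split(prompt)[1]` does above, assumes the decoded text reproduces the prompt verbatim; if special-token handling ever alters the decoded prefix, that indexing raises an `IndexError`. A slightly more defensive variant (a sketch; `extract_answer` is a hypothetical helper, not part of this repository):

```python
def extract_answer(decoded: str, prompt: str) -> str:
    # Generated text normally starts with the prompt itself;
    # strip it when present, otherwise return the decoded text unchanged.
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded.strip()
```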

For more extensive usage examples and many generated samples (both good and bad), see the [Inference Notebook](https://github.com/RobinSmits/Dutch-LLMs/blob/main/PolyLM_13B_Alpaca_Clean_Dutch_Inference.ipynb).
82 |
## Intended uses & limitations
|
83 |
|
84 |
The PolyLM-13B model was trained on 18 languages. The primary focus was to create a multi-lingual Open LLM.
|