herisan committed on
Commit
9591032
1 Parent(s): 1c3fe78

Update README.md

Files changed (1)
  1. README.md +50 -22
README.md CHANGED
@@ -1,28 +1,56 @@
- ---
- language:
- - en
- license: apache-2.0
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - llama
- - trl
- - sft
- base_model: unsloth/llama-3-8b-bnb-4bit
- ---
-
- # Usage
-
  !pip -q install git+https://github.com/huggingface/transformers # need to install from github
  !pip -q install bitsandbytes accelerate xformers einops
 
- # Uploaded model
+ import os
+ import torch
+ import transformers
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
+
+ model_name = "herisan/llama-3-8b_mental_health_counseling_conversations"
+
+ # use the commented out parts for running in 4bit
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_use_double_quant=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16
+ )
+
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     quantization_config=bnb_config,
+     # low_cpu_mem_usage=True
+ )
+
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ tokenizer.bos_token_id = 1
 
- - **Developed by:** herisan
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/llama-3-8b-bnb-4bit
+ stop_token_ids = [0]
 
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+ pipe = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     use_cache=True,
+     device_map="auto",
+     max_length=2046,
+     do_sample=True,
+     top_k=5,
+     num_return_sequences=1,
+     eos_token_id=tokenizer.eos_token_id,
+     pad_token_id=tokenizer.eos_token_id,
+ )
 
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ messages = [
+     {
+         "role": "system",
+         "content": "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.",
+     },
+     {"role": "user", "content": "I'm going through some things with my feelings and myself. I barely sleep and I do nothing but think about how I'm worthless and how I shouldn't be here. I've never tried or contemplated suicide. I've always wanted to fix my issues, but I never get around to it. How can I change my feeling of being worthless to everyone?"},
+ ]
+ prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ outputs = pipe(prompt, max_new_tokens=2046, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, truncation=True)
+ print(outputs[0]["generated_text"])
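
Note on the generation settings in the added snippet: `pipeline(...)` is built with `max_length=2046` and `top_k=5`, while the later `pipe(...)` call passes `max_new_tokens=2046` and `top_k=50`. As far as I know, call-time arguments take precedence over the values given at construction, and when both length limits are set `transformers` warns and uses `max_new_tokens`. The sketch below only illustrates that merge with plain dictionaries. It does not call `transformers`, and the names `PIPELINE_DEFAULTS`, `CALL_OVERRIDES`, and `effective_generation_kwargs` are hypothetical, introduced here for illustration.

```python
from typing import Any, Dict

# Values copied from the snippet added in this commit.
PIPELINE_DEFAULTS: Dict[str, Any] = {
    "max_length": 2046,
    "do_sample": True,
    "top_k": 5,
    "num_return_sequences": 1,
}
CALL_OVERRIDES: Dict[str, Any] = {
    "max_new_tokens": 2046,
    "do_sample": True,
    "temperature": 0.7,
    "top_k": 50,
    "top_p": 0.95,
}


def effective_generation_kwargs(
    defaults: Dict[str, Any], overrides: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical helper: approximate which settings end up being used.

    Assumption: call-time values win over construction-time defaults, and
    max_new_tokens supersedes max_length when both are present.
    """
    merged = {**defaults, **overrides}  # later keys overwrite earlier ones
    if "max_new_tokens" in merged:
        merged.pop("max_length", None)  # keep a single length limit
    return merged


if __name__ == "__main__":
    settings = effective_generation_kwargs(PIPELINE_DEFAULTS, CALL_OVERRIDES)
    for name in sorted(settings):
        print(f"{name} = {settings[name]}")
```

If that reading is right, the `top_k=5` and `max_length=2046` passed to `pipeline(...)` have no effect on this call, so keeping the sampling settings in one place would make the README easier to follow.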