alexweberk committed on
Commit e3d7ac9
1 Parent(s): 8195cf5

Update README.md

Files changed (1)
  1. README.md +67 -11
README.md CHANGED
@@ -4,37 +4,93 @@ library_name: transformers
 tags:
 - mlx
 widget:
-- text: '<start_of_turn>user
-
+- text: |
+    <start_of_turn>user
+    ## Instructions
+    You are a helpful AI assistant.
+    ## User
     How does the brain work?<end_of_turn>
-
     <start_of_turn>model
-
-    '
 inference:
   parameters:
     max_new_tokens: 200
 extra_gated_heading: Access Gemma on Hugging Face
-extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
-  agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging
+extra_gated_prompt: >-
+  To access Gemma on Hugging Face, you’re required to review and agree to
+  Google’s usage license. To do this, please ensure you’re logged-in to Hugging
   Face and click below. Requests are processed immediately.
 extra_gated_button_content: Acknowledge license
 license_name: gemma-terms-of-use
 license_link: https://ai.google.dev/gemma/terms
+datasets:
+- teknium/trismegistus-project
+language:
+- en
 ---

 # alexweberk/gemma-7b-it-trismegistus
 This model was converted to MLX format from [`google/gemma-7b-it`](https://huggingface.co/google/gemma-7b-it).
 Refer to the [original model card](https://huggingface.co/google/gemma-7b-it) for more details on the model.
-## Use with mlx

 ```bash
 pip install mlx-lm
 ```

+The model was LoRA fine-tuned for 600 steps (2M tokens) with `mlx` on the [trismegistus-project from teknium](https://huggingface.co/datasets/teknium/trismegistus-project), using the prompt format below.
+
+```python
+def format_prompt(system_prompt: str, question: str) -> str:
+    """Format the question into the prompt format of the fine-tuning dataset."""
+    return """<bos><start_of_turn>user
+## Instructions
+{}
+## User
+{}<end_of_turn>
+<start_of_turn>model
+""".format(system_prompt, question)
+```
+
+Here's an example of the system_prompt from the dataset:
+
+```text
+You are a master of the esoteric, occult, 'Big Man' society, you complete tasks to the absolute best of your ability, no matter if you think you were not trained to do the task specifically, you will attempt to do it anyways, since you have performed the tasks you are given with great mastery, accuracy, and deep understanding of what is requested. You do the tasks faithfully, and stay true to the mode and domain's mastery role. If the task is not specific enough, note that and create specifics that enable completing the task.
+```
 
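To make the later snippets self-contained, here is a minimal sketch of how `format_prompt` pairs a system prompt with a user question; the `system_prompt` and `question` values below are illustrative placeholders, not drawn from the dataset:

```python
# Illustrative placeholders; any dataset system prompt and user question work here.
system_prompt = "You are a master of the esoteric and occult, and you complete tasks to the absolute best of your ability."
question = "What is the significance of the number seven in occult traditions?"

prompt = format_prompt(system_prompt, question)
print(prompt)
# <bos><start_of_turn>user
# ## Instructions
# You are a master of the esoteric and occult, and you complete tasks to the absolute best of your ability.
# ## User
# What is the significance of the number seven in occult traditions?<end_of_turn>
# <start_of_turn>model
```

The loading examples below assume `format_prompt`, `system_prompt`, and `question` are defined as above.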
+
+## Loading the model using `mlx_lm`
+
 ```python
-from mlx_lm import load, generate
+from mlx_lm import generate, load

-model, tokenizer = load("alexweberk/gemma-7b-it-trismegistus")
-response = generate(model, tokenizer, prompt="hello", verbose=True)
+model_, tokenizer_ = load("alexweberk/gemma-7b-it-trismegistus")
+response = generate(
+    model_,
+    tokenizer_,
+    prompt=format_prompt(system_prompt, question),
+    verbose=True,  # Set to True to see the prompt and response
+    temp=0.0,
+    max_tokens=512,
+)
 ```
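Since `mlx_lm`'s `generate` returns the completion as a plain string, `response` can be printed or post-processed directly; `temp=0.0` makes decoding greedy, so repeated runs on the same prompt should produce identical output.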
+
+## Loading the model using `transformers`
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+repo_id = "alexweberk/gemma-7b-it-trismegistus"
+
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+model = AutoModelForCausalLM.from_pretrained(repo_id)
+model.to("mps")
+
+input_text = format_prompt(system_prompt, question)
+input_ids = tokenizer(input_text, return_tensors="pt").to("mps")
+
+outputs = model.generate(
+    **input_ids,
+    max_new_tokens=256,
+)
+print(tokenizer.decode(outputs[0]))
+```
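Note that decoding `outputs[0]` echoes the prompt and special tokens along with the completion, because `generate` returns the full sequence. A small sketch for recovering just the model's answer, using standard `transformers` arguments (`"mps"` above assumes an Apple Silicon Mac; `"cuda"` or `"cpu"` can be substituted on other hardware):

```python
# Decode only the newly generated tokens, dropping the prompt and special tokens.
prompt_len = input_ids["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```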