gagan3012 committed on
Commit 2eb6621
1 Parent(s): 2960d00

Upload folder using huggingface_hub

Files changed (1): README.md (+60, -0)
README.md ADDED
 
---
license: apache-2.0
tags:
- moe
- mergekit
- merge
- microsoft/phi-2
- microsoft/phi-2
- microsoft/phi-2
- microsoft/phi-2
---

# MetaModel_moe_small

This model is a Mixture of Experts (MoE) made with [mergekit](https://github.com/cg123/mergekit) (mixtral branch). It uses the following base models:
* [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
* [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
* [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
* [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)

## 🧩 Configuration

```yaml
base_model: microsoft/phi-2
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: microsoft/phi-2
    positive_prompts: [""]
  - source_model: microsoft/phi-2
    positive_prompts: [""]
  - source_model: microsoft/phi-2
    positive_prompts: [""]
  - source_model: microsoft/phi-2
    positive_prompts: [""]
```

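To turn a configuration like this into an actual checkpoint, mergekit's MoE entry point is run over the YAML file. The cell below is a minimal sketch, assuming the mixtral branch of mergekit is installed, that the configuration above is saved as `config.yaml`, and that `merge` is simply the name of the output directory:

```python
# Minimal sketch: build the MoE from the config above with mergekit (mixtral branch).
# Assumes the YAML is saved as config.yaml; "merge" is an arbitrary output directory.
!pip install -qU git+https://github.com/cg123/mergekit.git@mixtral
!mergekit-moe config.yaml merge
```
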
## 💻 Usage

```python
# Install dependencies (notebook-style; run once).
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "gagan3012/MetaModel_moe_small"

# Load the tokenizer and a text-generation pipeline with the model in 4-bit.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Build a prompt from a chat-style message and generate a completion.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
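If you would rather call `generate` directly (for example, without 4-bit quantization), a sketch along the following lines should also work; the plain-text prompt format is an assumption, since the card does not prescribe one:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "gagan3012/MetaModel_moe_small"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to cut memory use
    device_map="auto",          # spread layers over available devices
    trust_remote_code=True,     # assumption: phi-2-based checkpoints may ship custom modeling code
)

# Plain-text prompt (assumed format); adjust to whatever works best for this merge.
prompt = "Explain what a Mixture of Experts is in less than 100 words."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```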