paulilioaica/PhiMiX-2x2B

@@ -13,23 +13,38 @@ base_model:
 - rhysjones/phi-2-orange
 ---
-# PhiMiX-2x2B_embed
-PhiMiX-2x2B_embed is a Mixture of Experts (MoE) made with the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [cognitivecomputations/dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)
 * [rhysjones/phi-2-orange](https://huggingface.co/rhysjones/phi-2-orange)
 ## 🧩 Configuration
 ```yaml
 base_model: rhysjones/phi-2-orange
-gate_mode: cheap_embed
 dtype: float16
 experts:
   - source_model: cognitivecomputations/dolphin-2_6-phi-2
-    positive_prompts: ["research, logic, math, science"]
   - source_model: rhysjones/phi-2-orange
-    positive_prompts: ["programming, reasoning"]
 ```
 ## 💻 Usage
@@ -47,11 +62,13 @@ tokenizer = AutoTokenizer.from_pretrained(model)
 pipeline = transformers.pipeline(
     "text-generation",
     model=model,
-    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
 )
-messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
-prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
 print(outputs[0]["generated_text"])
 ```

 - rhysjones/phi-2-orange
 ---
+# PhiMiX-2x2B
+## Code is work in progress
+<p align="center">
+<img src="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11201acc-4089-416d-921b-cbd71fbf8ddb_1024x1024.jpeg" width="500" class="center"/>
+</p>
+PhiMiX-2x2B is a Mixture of Experts (MoE) made with the following models using mergekit:
 * [cognitivecomputations/dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)
 * [rhysjones/phi-2-orange](https://huggingface.co/rhysjones/phi-2-orange)
+## ©️ Credits
+* [mlabonne's phixtral](https://huggingface.co/mlabonne/phixtral-4x2_8) for the PhiConfig and inference code.
+* [mergekit](https://github.com/cg123/mergekit) code, which I tweaked mainly by adding the PhiConfig (you can find it [here](https://github.com/cg123/mergekit/blob/508348ae34be17ea0a95d0a288a6e34491a2558a/mergekit/architecture.py#L289))
+to the `moe_mixtral.py` script from the `mixtral` branch.
 ## 🧩 Configuration
 ```yaml
 base_model: rhysjones/phi-2-orange
+gate_mode: random
 dtype: float16
 experts:
   - source_model: cognitivecomputations/dolphin-2_6-phi-2
+    positive_prompts: [""]
   - source_model: rhysjones/phi-2-orange
+    positive_prompts: [""]
 ```
 ## 💻 Usage
 pipeline = transformers.pipeline(
     "text-generation",
     model=model,
+    trust_remote_code=True,
+    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
 )
+prompt = "How many continents are there?"
+input = f"Instruct: {prompt}\nOutput:"
+outputs = pipeline(input, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
 print(outputs[0]["generated_text"])
 ```
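The updated usage snippet replaces the chat template with a plain `Instruct: ...\nOutput:` prompt. A minimal sketch of that prompt handling follows. The helper names `build_prompt` and `extract_answer` are hypothetical and not part of the model repository, and the stripping step assumes the text-generation pipeline echoes the prompt at the start of `generated_text`, which is its default behaviour.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the Instruct/Output template from the usage snippet."""
    return f"Instruct: {question.strip()}\nOutput:"


def extract_answer(generated_text: str, prompt: str) -> str:
    """Remove the echoed prompt from the generated text, if it is present."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


if __name__ == "__main__":
    prompt = build_prompt("How many continents are there?")
    # Stand-in for outputs[0]["generated_text"]; this is not real model output.
    placeholder_generation = prompt + " <model answer would appear here>"
    print(repr(prompt))
    print(extract_answer(placeholder_generation, prompt))
```

With the real pipeline, `build_prompt(...)` would be passed as the first argument to `pipeline(...)`, and `extract_answer(outputs[0]["generated_text"], prompt)` would return only the completion.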