File size: 12,102 Bytes
4e52ec2 7aba582 4e52ec2 7aba582 4e52ec2 cd87a9d 7aba582 4e52ec2 cd87a9d 4e52ec2 cd87a9d 4e52ec2 cd87a9d 4e52ec2 7aba582 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 |
---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- merge
- mergekit
- lazymergekit
- bfloat16
- roleplay
- creative
- instruct
- anvita
- qwen
- nerd
- homer
- Qandora
base_model:
- bunnycore/Qandora-2.5-7B-Creative
- allknowingroger/HomerSlerp1-7B
- sethuiyer/Qwen2.5-7B-Anvita
- fblgit/cybertron-v4-qw7B-MGS
- jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
- newsbang/Homer-v0.5-Qwen2.5-7B
pipeline_tag: text-generation
model-index:
- name: Qwen2.5-7B-HomerAnvita-NerdMix
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 77.08
name: strict accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 36.58
name: normalized accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 29.53
name: exact match
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 9.28
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 14.41
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 38.13
name: accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
name: Open LLM Leaderboard
---
# ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
**ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix** is an advanced language model meticulously crafted by merging five pre-trained models using the powerful [mergekit](https://github.com/cg123/mergekit) framework. This fusion leverages the **Model Stock** merge method to combine the creative prowess of **Qandora**, the instructive capabilities of **Qwen-Instruct-Fusion**, the sophisticated blending of **HomerSlerp1**, the mathematical precision of **Cybertron-MGS**, and the uncensored expertise of **Qwen-Nerd**. The resulting model excels in creative text generation, contextual understanding, technical reasoning, and dynamic conversational interactions.
## π Merged Models
This model merge incorporates the following:
- [**bunnycore/Qandora-2.5-7B-Creative**](https://huggingface.co/bunnycore/Qandora-2.5-7B-Creative): Specializes in creative text generation, enhancing the model's ability to produce imaginative and diverse content.
- [**allknowingroger/HomerSlerp1-7B**](https://huggingface.co/allknowingroger/HomerSlerp1-7B): Utilizes spherical linear interpolation (SLERP) to blend model weights smoothly, ensuring a harmonious integration of different model attributes.
- [**sethuiyer/Qwen2.5-7B-Anvita**](https://huggingface.co/sethuiyer/Qwen2.5-7B-Anvita): Focuses on instruction-following capabilities, improving the model's performance in understanding and executing user commands.
- [**fblgit/cybertron-v4-qw7B-MGS**](https://huggingface.co/fblgit/cybertron-v4-qw7B-MGS): Enhances mathematical reasoning and precision, enabling the model to handle complex computational tasks effectively.
- [**jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0**](https://huggingface.co/jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0): Provides uncensored expertise and robust technical knowledge, making the model suitable for specialized technical support and information retrieval.
- [**newsbang/Homer-v0.5-Qwen2.5-7B**](https://huggingface.co/newsbang/Homer-v0.5-Qwen2.5-7B): Acts as the foundational conversational model, providing robust language comprehension and generation capabilities.
## 𧩠Merge Configuration
The configuration below outlines how the models are merged using the **Model Stock** method. This approach ensures a balanced and effective integration of the unique strengths from each source model.
```yaml
# Merge configuration for ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix using Model Stock
models:
- model: bunnycore/Qandora-2.5-7B-Creative
- model: allknowingroger/HomerSlerp1-7B
- model: sethuiyer/Qwen2.5-7B-Anvita
- model: fblgit/cybertron-v4-qw7B-MGS
- model: jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
merge_method: model_stock
base_model: newsbang/Homer-v0.5-Qwen2.5-7B
normalize: false
int8_mask: true
dtype: bfloat16
```
### Key Parameters
- **Merge Method (`merge_method`):** Utilizes the **Model Stock** method, as described in [Model Stock](https://arxiv.org/abs/2403.19522), to effectively combine multiple models by leveraging their strengths.
- **Models (`models`):** Specifies the list of models to be merged:
- **bunnycore/Qandora-2.5-7B-Creative:** Enhances creative text generation.
- **allknowingroger/HomerSlerp1-7B:** Facilitates smooth blending of model weights using SLERP.
- **sethuiyer/Qwen2.5-7B-Anvita:** Improves instruction-following capabilities.
- **fblgit/cybertron-v4-qw7B-MGS:** Enhances mathematical reasoning and precision.
- **jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0:** Provides uncensored technical expertise.
- **Base Model (`base_model`):** Defines the foundational model for the merge, which is **newsbang/Homer-v0.5-Qwen2.5-7B** in this case.
- **Normalization (`normalize`):** Set to `false` to retain the original scaling of the model weights during the merge.
- **INT8 Mask (`int8_mask`):** Enabled (`true`) to apply INT8 quantization masking, optimizing the model for efficient inference without significant loss in precision.
- **Data Type (`dtype`):** Uses `bfloat16` to maintain computational efficiency while ensuring high precision.
## π Performance Highlights
- **Creative Text Generation:** Enhanced ability to produce imaginative and diverse content suitable for creative writing, storytelling, and content creation.
- **Instruction Following:** Improved performance in understanding and executing user instructions, making the model more responsive and accurate in task execution.
- **Mathematical Reasoning:** Enhanced capability to handle complex computational tasks with high precision, suitable for technical and analytical applications.
- **Uncensored Technical Expertise:** Provides robust technical knowledge without content restrictions, making it ideal for specialized technical support and information retrieval.
- **Optimized Inference:** INT8 masking and `bfloat16` data type contribute to efficient computation, enabling faster response times without compromising quality.
## π― Use Case & Applications
**ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix** is designed to excel in environments that demand a combination of creative generation, precise instruction following, mathematical reasoning, and technical expertise. Ideal applications include:
- **Creative Writing Assistance:** Aiding authors and content creators in generating imaginative narratives, dialogues, and descriptive text.
- **Interactive Storytelling and Role-Playing:** Enhancing dynamic and engaging interactions in role-playing games and interactive storytelling platforms.
- **Educational Tools and Tutoring Systems:** Providing detailed explanations, answering questions, and assisting in educational content creation with contextual understanding.
- **Technical Support and Customer Service:** Offering accurate and contextually relevant responses in technical support scenarios, improving user satisfaction.
- **Content Generation for Marketing:** Creating compelling and diverse marketing copy, social media posts, and promotional material with creative flair.
- **Mathematical Problem Solving:** Assisting in solving complex mathematical problems and providing step-by-step explanations for educational purposes.
- **Technical Documentation and Analysis:** Generating detailed technical documents, reports, and analyses with high precision and clarity.
## π Usage
To utilize **ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix**, follow the steps below:
### Installation
First, install the necessary libraries:
```bash
pip install -qU transformers accelerate
```
### Example Code
Below is an example of how to load and use the model for text generation:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix"
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the model
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Initialize the pipeline
text_generator = pipeline(
"text-generation",
model=model,
tokenizer=tokenizer,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."
# Generate the output
outputs = text_generator(
prompt,
max_new_tokens=150,
do_sample=True,
temperature=0.7,
top_k=50,
top_p=0.95
)
# Print the generated text
print(outputs[0]["generated_text"])
```
### Notes
- **Fine-Tuning:** This merged model may require fine-tuning to optimize performance for specific applications or domains.
- **Resource Requirements:** Ensure that your environment has sufficient computational resources, especially GPU-enabled hardware, to handle the model efficiently during inference.
- **Customization:** Users can adjust parameters such as `temperature`, `top_k`, and `top_p` to control the creativity and diversity of the generated text.
## π License
This model is open-sourced under the **Apache-2.0 License**.
## π‘ Tags
- `merge`
- `mergekit`
- `model_stock`
- `Qwen`
- `Homer`
- `Anvita`
- `Nerd`
- `ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix`
- `bunnycore/Qandora-2.5-7B-Creative`
- `allknowingroger/HomerSlerp1-7B`
- `sethuiyer/Qwen2.5-7B-Anvita`
- `fblgit/cybertron-v4-qw7B-MGS`
- `jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0`
- `newsbang/Homer-v0.5-Qwen2.5-7B`
---
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ZeroXClem__Qwen2.5-7B-HomerAnvita-NerdMix)
| Metric |Value|
|-------------------|----:|
|Avg. |34.17|
|IFEval (0-Shot) |77.08|
|BBH (3-Shot) |36.58|
|MATH Lvl 5 (4-Shot)|29.53|
|GPQA (0-shot) | 9.28|
|MuSR (0-shot) |14.41|
|MMLU-PRO (5-shot) |38.13|
|