AgoraMix is crafted using semi-automated processes to merge top-performing models.
Note: This model was prompted by seeing density parameters on model_stock merges on Hugging Face, and it began as an experiment to see their effect. However, checking the main branch of mergekit's merge_methods shows that model_stock ignores those parameters, and I don't maintain a fork that would support them. So this is a normal model_stock merge, but its performance is still promising.
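For context, here is a minimal sketch of why extra density/weight knobs have nothing to act on in model_stock: the method derives its interpolation ratio from the geometry of the fine-tuned checkpoints themselves. This is only an illustration based on the Model Stock paper, not mergekit's actual code, and `model_stock_layer` is a hypothetical helper name.

```python
# Rough sketch of the Model Stock idea for one weight tensor, assuming N
# fine-tuned models that share a single base. Illustrative only; not
# mergekit's implementation.
import numpy as np

def model_stock_layer(base: np.ndarray, finetuned: list[np.ndarray]) -> np.ndarray:
    n = len(finetuned)
    deltas = [w - base for w in finetuned]  # task vectors
    # Average pairwise cosine similarity between task vectors.
    cos_vals = []
    for i in range(n):
        for j in range(i + 1, n):
            a, b = deltas[i].ravel(), deltas[j].ravel()
            cos_vals.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    cos_theta = float(np.mean(cos_vals)) if cos_vals else 1.0
    # Interpolation ratio as described in the paper: t = N*cos / (1 + (N-1)*cos).
    t = n * cos_theta / (1 + (n - 1) * cos_theta)
    w_avg = np.mean(finetuned, axis=0)
    # Pull the average of the fine-tuned weights toward the base by (1 - t);
    # no per-model density or weight parameter appears anywhere in this rule.
    return t * w_avg + (1 - t) * base
```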
## Ancestor Models
- **VAGOsolutions/SauerkrautLM-v2-14b-DPO**: solid instruction-following and problem-solving capabilities.
- **arcee-ai/SuperNova-Medius**: brings a large knowledge base distilled from Llama 3.1 into the Qwen architecture. Moderate density and gentle weighting were intended to draw on its knowledge pool while keeping behavior predictable.
- **CultriX/Qwen2.5-14B-Wernicke**: a heavily emphasized ancestor, because its problem-solving, factuality, and comprehension are exceptional for the model size. Shout out to CultriX, whose methods helped inspire this merge.
- **rombodawg/Rombos-LLM-V2.6-Qwen-14b**: lightly applied to enhance reasoning abilities.
- **underwoods/medius-erebus-magnum-14b**: subtly incorporated to improve prose quality.
## Merge Configuration

The following YAML configuration was used to produce this model:
```yaml
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B
tokenizer_source: base
parameters:
  int8_mask: false
  normalize: true
  rescale: false
models:
  - model: VAGOsolutions/SauerkrautLM-v2-14b-DPO
  - model: arcee-ai/SuperNova-Medius
  - model: CultriX/Qwen2.5-14B-Wernicke
  - model: rombodawg/Rombos-LLM-V2.6-Qwen-14b
  - model: underwoods/medius-erebus-magnum-14b
dtype: bfloat16
out_dtype: bfloat16
```
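To reproduce the merge, a config like the one above can be fed to mergekit's CLI, e.g. `mergekit-yaml agoramix.yaml ./AgoraMix --cuda`. The config filename and output path here are placeholders, and the available flags may vary across mergekit versions.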