---
base_model:
- djuna/Q2.5-Veltha-14B-0.5
- CultriX/Qwenfinity-2.5-14B
- qingy2019/Qwen2.5-Math-14B-Instruct
- sometimesanotion/Qwen2.5-14B-Vimarckoso
- CultriX/Qwen2.5-14B-Wernickev3
- CultriX/Qwen2.5-14B-Broca
- CultriX/SeQwence-14Bv1
library_name: transformers
tags:
- mergekit
- merge
---

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the della_linear merge method, with [CultriX/Qwen2.5-14B-Wernickev3](https://huggingface.co/CultriX/Qwen2.5-14B-Wernickev3) as the base model.

### Models Merged

The following models were included in the merge:
* [djuna/Q2.5-Veltha-14B-0.5](https://huggingface.co/djuna/Q2.5-Veltha-14B-0.5)
* [CultriX/Qwenfinity-2.5-14B](https://huggingface.co/CultriX/Qwenfinity-2.5-14B)
* [qingy2019/Qwen2.5-Math-14B-Instruct](https://huggingface.co/qingy2019/Qwen2.5-Math-14B-Instruct)
* [sometimesanotion/Qwen2.5-14B-Vimarckoso](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso)
* [CultriX/Qwen2.5-14B-Broca](https://huggingface.co/CultriX/Qwen2.5-14B-Broca)
* [CultriX/SeQwence-14Bv1](https://huggingface.co/CultriX/SeQwence-14Bv1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: della_linear
base_model: CultriX/Qwen2.5-14B-Wernickev3
dtype: bfloat16
parameters:
  epsilon: 0.03          # Refines sharper parameter scaling.
  lambda: 1.1            # Balances blending while emphasizing significant contributions.
  normalize: true        # Ensures stable parameter integration across models.
adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.5         # Logical reasoning boost.
    tinyHellaswag: 1.3   # Contextual and multi-step reasoning.
    tinyMMLU: 1.2        # Domain-specific knowledge retention.
    tinyTruthfulQA: 1.6  # Enhanced factual QA tasks.
    tinyTruthfulQA_mc1: 1.4
    tinyWinogrande: 1.5  # Reasoning for multi-turn tasks.
    IFEval: 1.6          # Instruction-following.
    BBH: 1.5             # Complex reasoning improvement.
    MATH: 1.7            # Mathematical reasoning focus.
    GPQA: 1.6            # Graduate-level QA emphasis.
    MUSR: 1.6            # Advanced multi-step reasoning.
    MMLU-PRO: 1.5        # Multitask domain performance.
  smoothing_factor: 0.15 # Balances model contributions.
gradient_clipping: 0.85  # Ensures no single model overly dominates.
models:
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.2        # Core multitask foundation.
      density: 0.7
  - model: CultriX/Qwenfinity-2.5-14B
    parameters:
      weight: 0.18       # Broad multitask capabilities.
      density: 0.65
  - model: CultriX/Qwen2.5-14B-Broca
    parameters:
      weight: 0.15       # Logical reasoning and multitask adaptability.
      density: 0.6
  - model: djuna/Q2.5-Veltha-14B-0.5
    parameters:
      weight: 0.15       # Specialized for MUSR, IFEval, and BBH.
      density: 0.6
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.12       # Mathematical reasoning contributor.
      density: 0.6
  - model: CultriX/SeQwence-14Bv1
    parameters:
      weight: 0.12       # Broad multitask contributor.
      density: 0.6
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso
    parameters:
      weight: 0.08       # Specialist for MUSR and advanced reasoning tasks.
      density: 0.5
tokenizer_source: CultriX/Qwen2.5-14B-Wernickev3
```
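
## Usage

The configuration above can be re-applied with mergekit's `mergekit-yaml` entry point to reproduce the merge. Below is a minimal sketch of loading the resulting checkpoint with 🤗 Transformers; the model path is a placeholder (substitute the local merge output directory or a Hub repo id), and the generation settings are illustrative rather than recommended defaults.

```python
# Minimal usage sketch. Assumption: "path/to/merged-model" is a placeholder for the
# mergekit output directory or a Hugging Face repo id, not an actual published repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/merged-model"  # replace with the real path or repo id

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

# Qwen2.5-based models ship a chat template, so apply_chat_template is the usual entry point.
messages = [{"role": "user", "content": "Explain the Pythagorean theorem in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```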