MN-Tiramisu-12B

This is a really yappity-yappy yapping model that's good for long-form RP. I tried to rein it in with Mahou and give it some more character understanding with Pantheon. Feedback is always welcome.

Native Context Length: 16K (16,384 tokens). It can be extended using RoPE scaling, YMMV.

Prompt Template: ChatML

<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
{response}
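
If your frontend doesn't have a ChatML preset, the template above can be assembled by hand. This is a minimal sketch; the function name `build_chatml_prompt` and the example strings are made up for illustration and aren't part of any library.

```python
def build_chatml_prompt(system_prompt, messages):
    """Render a system prompt and (role, text) turns as one ChatML string."""
    parts = [f"<|im_start|>system\n{system_prompt}<|im_end|>\n"]
    for role, text in messages:
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>\n")
    # Leave the assistant turn open so the model writes the response.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Hypothetical example strings, just to show the shape of the output.
prompt = build_chatml_prompt("You are a storyteller.", [("user", "Hello!")])
print(prompt)
```

Generation should stop at `<|im_end|>`, so it helps to set that as a stop string if your backend doesn't already.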

Recommended Settings:

Here are some setting ranges that tend to work for me. They aren't strict values, so there's some leeway. Feel free to experiment!

  • Temperature: 1.0 (maybe less, a little bit goes a long way with Nemo)
  • Min-P: 0.1 to 0.2
  • (all other samplers disabled)
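
For anyone curious what Min-P actually does with those numbers, here's a rough pure-Python sketch: it keeps only tokens whose probability is at least `min_p` times the top token's probability, then samples from what's left. The function name `min_p_sample` and the toy logits are made up, and applying temperature before the filter is an assumption; real backends differ in sampler order.

```python
import math
import random


def min_p_sample(logits, temperature=1.0, min_p=0.1, rng=random):
    """Sample a token id with temperature + Min-P; returns (token_id, kept_ids)."""
    # Temperature-scaled softmax (assumed to run before the Min-P filter).
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep tokens whose probability is at least min_p * (top probability).
    threshold = min_p * max(probs)
    kept_ids = [i for i, p in enumerate(probs) if p >= threshold]
    kept_probs = [probs[i] for i in kept_ids]

    # random.choices renormalizes the weights for us.
    token_id = rng.choices(kept_ids, weights=kept_probs, k=1)[0]
    return token_id, kept_ids


# Toy logits: the third token is far less likely and gets filtered out.
token_id, kept_ids = min_p_sample([5.0, 4.0, 0.0], temperature=1.0, min_p=0.1,
                                  rng=random.Random(0))
print(token_id, kept_ids)
```

A higher Min-P cuts more of the unlikely tail, which is why it pairs well with a temperature around 1.0.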

Merge Details

This is a merge of pre-trained language models created using mergekit.

Merge Method

This model was merged with the linear DARE merge method, with flammenai/Mahou-1.3-mistral-nemo-12B as the base.
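
As a rough idea of what linear DARE does, here's a toy sketch on plain lists of floats: take each model's difference from the base, randomly drop some of those differences, rescale the survivors, and add a weighted sum back onto the base. This is not mergekit's code. The function name `dare_linear` and the `density` and `seed` arguments are assumptions for illustration, and the config on this card doesn't state a density.

```python
import random


def dare_linear(base, finetuned, weights, density=1.0, seed=0):
    """Toy linear DARE merge over flat lists of parameters.

    density is the probability of keeping each delta (assumed knob).
    """
    rng = random.Random(seed)
    merged = list(base)
    for model, weight in zip(finetuned, weights):
        for i, (b, f) in enumerate(zip(base, model)):
            delta = f - b
            # Drop the delta with probability (1 - density); rescale the
            # survivors by 1 / density so the expected update is unchanged.
            if rng.random() < density:
                merged[i] += weight * delta / density
    return merged


# Two made-up "models" with two parameters each; density=1.0 keeps everything.
merged = dare_linear(
    base=[1.0, 2.0],
    finetuned=[[2.0, 2.0], [1.0, 4.0]],
    weights=[0.5, 0.25],
)
print(merged)
```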

Models Merged

The following models were included in the merge:

  • nbeerbower/mistral-nemo-gutenberg-12B-v4
  • Sao10K/MN-12B-Lyra-v1
  • Gryphe/Pantheon-RP-1.5-12b-Nemo
  • flammenai/Mahou-1.3-mistral-nemo-12B

Configuration

The following YAML configuration was used to produce this model:

base_model: flammenai/Mahou-1.3-mistral-nemo-12B
dtype: bfloat16
merge_method: dare_linear
slices:
- sources:
  - layer_range: [0, 40]
    model: Gryphe/Pantheon-RP-1.5-12b-Nemo
    parameters:
      weight: [0.45, 0.35, 0.35, 0.2, 0.2]
  - layer_range: [0, 40]
    model: Sao10K/MN-12B-Lyra-v1
    parameters:
      weight: [0.25, 0.3, 0.35, 0.3, 0.2]
  - layer_range: [0, 40]
    model: nbeerbower/mistral-nemo-gutenberg-12B-v4
    parameters:
      weight:
      - filter: mlp
        value: [0.1, 0.2, 0.1, 0.4, 0.5]
      - value: [0.1, 0.2, 0.1, 0.2, 0.2]
  - layer_range: [0, 40]
    model: flammenai/Mahou-1.3-mistral-nemo-12B
    parameters:
      weight:
      - filter: mlp
        value: [0.2, 0.15, 0.2, 0.1, 0.1]
      - value: [0.2, 0.15, 0.2, 0.3, 0.4]
tokenizer_source: union
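
The five-element weight lists in the config are gradients: mergekit spreads them across the layer range by interpolation instead of using one value per layer. Here's a small sketch of that idea using linear interpolation over 40 layers. The function name `gradient_weight` is made up, and anchoring the first and last values to the first and last layers is my assumption, so mergekit's exact numbers may differ slightly.

```python
def gradient_weight(values, layer_idx, num_layers):
    """Linearly interpolate a gradient list to get one layer's weight."""
    if len(values) == 1 or num_layers == 1:
        return values[0]
    # Position of this layer along the gradient, from 0 to len(values) - 1.
    t = layer_idx / (num_layers - 1)
    pos = t * (len(values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(values) - 1)
    frac = pos - lo
    return (1 - frac) * values[lo] + frac * values[hi]


# Pantheon's weight gradient from the config, expanded to 40 layers.
pantheon = [0.45, 0.35, 0.35, 0.2, 0.2]
per_layer = [gradient_weight(pantheon, i, 40) for i in range(40)]
print([round(w, 3) for w in per_layer])
```

So Pantheon has the most influence in the early layers and tapers off toward the end, while the non-MLP Mahou weights do roughly the opposite.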

Benchmarks (or Benchmark because I tried only one)

I ran EQ-Bench using EleutherAI's lm-evaluation-harness (thank you @FallenMerick).

| Tasks  |Version|Filter|n-shot|     Metric      |   | Value  |   |Stderr|
|--------|------:|------|-----:|-----------------|---|-------:|---|-----:|
|eq_bench|    2.1|none  |     0|eqbench          |↑  | 79.3617|±  | 1.637|
|        |       |none  |     0|percent_parseable|↑  |100.0000|±  | 0.000|

And as always, have a great day!
