CarbonBeagle-11B

An experiment in merging models of different architectures and sizes. Here are the steps:

1. Upscale mlabonne/NeuralBeagle14-7B to vicgalle/franken-Beagle-11B.
2. DPO-tune vicgalle/franken-Beagle-11B to vicgalle/NeuralBeagle-11B.
3. Merge vicgalle/NeuralBeagle-11B and jeonsworld/CarbonVillain-en-10.7B-v4.

Merge Details

Merge Method

This model was merged using the linear merge method.

Models Merged

The following models were included in the merge:

jeonsworld/CarbonVillain-en-10.7B-v4
vicgalle/NeuralBeagle-11B

Configuration

The following YAML configuration was used to produce this model:

models:
    - model: jeonsworld/CarbonVillain-en-10.7B-v4
      parameters:
        weight: 1.0
    - model: vicgalle/NeuralBeagle-11B
      parameters:
        weight: 0.5
merge_method: linear

dtype: float16
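
As a rough illustration of what a linear merge with these weights computes, the sketch below takes a weighted average of two toy parameter vectors. It assumes the weights are normalized to sum to 1 before averaging (mergekit's linear method is understood to do this by default, but the card does not say so), which would make the effective weights 2/3 and 1/3. The helper names and the toy values are hypothetical and are not part of any merge tool.

```python
# Weights copied from the YAML configuration above.
CONFIG_WEIGHTS = {
    "jeonsworld/CarbonVillain-en-10.7B-v4": 1.0,
    "vicgalle/NeuralBeagle-11B": 0.5,
}


def normalized_weights(weights):
    """Rescale weights so they sum to 1 (assumed default behavior, see lead-in)."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def linear_merge(tensors, weights, normalize=True):
    """Element-wise weighted sum of same-length flat tensors (lists of floats)."""
    if normalize:
        weights = normalized_weights(weights)
    names = list(tensors)
    length = len(tensors[names[0]])
    if any(len(tensors[n]) != length for n in names):
        raise ValueError("all tensors must have the same shape")
    return [
        sum(weights[n] * tensors[n][i] for n in names)
        for i in range(length)
    ]


# Toy vectors standing in for one tensor from each checkpoint (made-up values).
toy_tensors = {
    "jeonsworld/CarbonVillain-en-10.7B-v4": [3.0, 0.0, -3.0],
    "vicgalle/NeuralBeagle-11B": [0.0, 3.0, 3.0],
}

merged = linear_merge(toy_tensors, CONFIG_WEIGHTS)
print(normalized_weights(CONFIG_WEIGHTS))
print(merged)
```

A real merge applies the same averaging to every tensor in the two checkpoints, which is only possible when the tensors have matching shapes; that is presumably why the 7B model was upscaled before merging.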

Evaluations

At the time of its creation (21-01-2024), it was the top-scoring model on the Open LLM Leaderboard in its size class (10.7B-11B), and it also scored above the 13B models.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	74.64
AI2 Reasoning Challenge (25-Shot)	71.84
HellaSwag (10-Shot)	88.93
MMLU (5-Shot)	66.62
TruthfulQA (0-shot)	69.43
Winogrande (5-shot)	84.06
GSM8k (5-shot)	66.94

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	22.36
IFEval (0-Shot)	54.15
BBH (3-Shot)	33.06
MATH Lvl 5 (4-Shot)	5.51
GPQA (0-shot)	6.94
MuSR (0-shot)	9.19
MMLU-PRO (5-shot)	25.29
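
The "Avg." rows in the two tables above appear to be plain arithmetic means of the six benchmark scores in each table. A minimal check, assuming an unweighted mean rounded to two decimals (the variable and function names are illustrative):

```python
# Scores copied from the first results table.
first_table = {
    "AI2 Reasoning Challenge (25-Shot)": 71.84,
    "HellaSwag (10-Shot)": 88.93,
    "MMLU (5-Shot)": 66.62,
    "TruthfulQA (0-shot)": 69.43,
    "Winogrande (5-shot)": 84.06,
    "GSM8k (5-shot)": 66.94,
}

# Scores copied from the second results table.
second_table = {
    "IFEval (0-Shot)": 54.15,
    "BBH (3-Shot)": 33.06,
    "MATH Lvl 5 (4-Shot)": 5.51,
    "GPQA (0-shot)": 6.94,
    "MuSR (0-shot)": 9.19,
    "MMLU-PRO (5-shot)": 25.29,
}


def mean_score(scores):
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(sum(scores.values()) / len(scores), 2)


print(mean_score(first_table))   # compare with the first "Avg." row
print(mean_score(second_table))  # compare with the second "Avg." row
```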

Safetensors: model size 10.7B params, tensor type FP16
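
Since the weights are published in FP16, loading them with the Hugging Face transformers library might look like the sketch below. The card gives no usage instructions, so this is an assumption-laden example: it supposes the repository loads through the standard `AutoModelForCausalLM` path, that `torch`, `transformers`, and `accelerate` are installed, and that a plain-text prompt is acceptable (no prompt template is documented here). The `generate` helper is hypothetical.

```python
MODEL_ID = "vicgalle/CarbonBeagle-11B"


def generate(prompt, max_new_tokens=64):
    """Load the model in FP16 and return a completion for `prompt`."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # matches the published tensor type
        device_map="auto",          # needs the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain model merging in one sentence.")` downloads the checkpoint on first use. At 10.7B parameters and 2 bytes per FP16 value, the weights alone take roughly 21 GB of memory.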


Evaluation results

Source: Open LLM Leaderboard

Benchmark	Metric	Split	Value
AI2 Reasoning Challenge (25-Shot)	normalized accuracy	test	71.840
HellaSwag (10-Shot)	normalized accuracy	validation	88.930
MMLU (5-Shot)	accuracy	test	66.620
TruthfulQA (0-shot)	mc2	validation	69.430
Winogrande (5-shot)	accuracy	validation	84.060
GSM8k (5-shot)	accuracy	test	66.940
IFEval (0-Shot)	strict accuracy	not listed	54.150
BBH (3-Shot)	normalized accuracy	not listed	33.060
MATH Lvl 5 (4-Shot)	exact match	not listed	5.510
GPQA (0-shot)	acc_norm	not listed	6.940
