PlumChat 70b
This is a merge of pre-trained language models created using mergekit.
Merge Details
PlumChat merges Shining Valiant 2 with Nemotron for high-quality general chat, science-instruct, and complex-query performance.
Merge Method
This model was merged using the della merge method, with meta-llama/Llama-3.1-70B-Instruct as the base.
Models Merged
The following models were included in the merge:
- nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
- ValiantLabs/Llama3.1-70B-ShiningValiant2
Configuration
The following YAML configuration was used to produce this model:
```yaml
merge_method: della
dtype: bfloat16
parameters:
  normalize: true
models:
  - model: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
    parameters:
      density: 0.5
      weight: 0.3
  - model: ValiantLabs/Llama3.1-70B-ShiningValiant2
    parameters:
      density: 0.5
      weight: 0.25
base_model: meta-llama/Llama-3.1-70B-Instruct
```
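In this configuration, density sets the fraction of each model's delta parameters retained during della's drop step, and weight scales each model's contribution before the deltas are added to the base. The merge can be reproduced by saving the YAML above and running it through mergekit. Below is a minimal sketch using mergekit's Python API (pip install mergekit); the file name config.yaml and the output path are placeholders, and exact option names may vary by mergekit version.

```python
# Minimal sketch of reproducing the merge with mergekit's Python API.
# CLI equivalent: mergekit-yaml config.yaml ./Llama3.1-70B-PlumChat --cuda
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the della configuration shown above (saved to disk as config.yaml).
with open("config.yaml") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Execute the merge; output path is a placeholder.
run_merge(
    merge_config,
    out_path="./Llama3.1-70B-PlumChat",
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```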
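Once merged (or downloaded from the Hub as sequelbox/Llama3.1-70B-PlumChat), the model can be used like any Llama 3.1 Instruct checkpoint. A minimal sketch with Hugging Face transformers, assuming enough GPU memory to host a 70B model in bfloat16:

```python
# Minimal usage sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sequelbox/Llama3.1-70B-PlumChat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge dtype above
    device_map="auto",           # shard across available GPUs
)

messages = [{"role": "user", "content": "Explain Hund's rules in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```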
Evaluation results
All scores are self-reported.
- Winogrande (5-shot), acc: 85.000
- ARC Challenge (25-shot), normalized accuracy: 67.410
- MMLU College Biology (5-shot), acc: 93.750
- MMLU High School Biology (5-shot), acc: 91.940
- MMLU Conceptual Physics (5-shot), acc: 82.130
- MMLU College Physics (5-shot), acc: 60.780
- MMLU High School Physics (5-shot), acc: 62.250
- MMLU College Chemistry (5-shot), acc: 56.000
- MMLU High School Chemistry (5-shot), acc: 73.400
- MMLU Astronomy (5-shot), acc: 89.470
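These scores can be re-checked with EleutherAI's lm-evaluation-harness (pip install lm-eval). A hedged sketch for the Winogrande number is below; simple_evaluate is the harness's public entry point, though argument names can vary between versions, and results may differ slightly from the self-reported figures depending on harness version and hardware.

```python
# Hedged sketch: re-running the 5-shot Winogrande evaluation.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=sequelbox/Llama3.1-70B-PlumChat,dtype=bfloat16",
    tasks=["winogrande"],
    num_fewshot=5,
)
print(results["results"]["winogrande"])
```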