Edit model card

This model has a LASER intervention on Mihaiii/Pallas-0.5-LASER-0.5 .

This was just an experiment. From my testing, Pallas-0.5 is better than this model.

Configs used:

  • lnum: 51
  • lnames: mlp (meaning: ["mlp.gate_proj.weight", "mlp.up_proj.weight", "mlp.down_proj.weight"])
  • rate: 8
  • dataset: bigbench (subset: causal_judgement)
  • intervention type: rank-reduction
Name Validation acc (higher is better) Validation logloss (lower is better) Test acc (higher is better) Test logloss (lower is better)
Pallas-0.5 55.263 1.650 60.526 1.463
Pallas-0.5-LASER-0.1 55.263 1.639 61.184 1.451
Pallas-0.5-LASER-0.2 55.263 1.646 61.184 1.458
Pallas-0.5-LASER-0.3 55.263 1.575 61.842 1.382
Pallas-0.5-LASER-0.4 55.263 1.525 61.842 1.326
Pallas-0.5-LASER-0.5 55.263 1.484 61.842 1.297
Pallas-0.5-LASER-0.6 55.263 1.455 61.184 1.283

In order to replicate on a single A100, you can use my branch (the original code will throw OOM for 34b models).

Prompt Format:

SYSTEM: <ANY SYSTEM CONTEXT>
USER: 
ASSISTANT:
Downloads last month
14
Safetensors
Model size
34.4B params
Tensor type
BF16
·
Inference Examples
Inference API (serverless) has been turned off for this model.

Model tree for Mihaiii/Pallas-0.5-LASER-0.6