This model is the result of a LASER intervention applied to Mihaiii/Pallas-0.5-LASER-0.5.
This was just an experiment; in my testing, Pallas-0.5 is better than this model.
Configs used:
- lnum: 51
- lnames: mlp (meaning: ["mlp.gate_proj.weight", "mlp.up_proj.weight", "mlp.down_proj.weight"])
- rate: 8
- dataset: bigbench (subset: causal_judgement)
- intervention type: rank-reduction
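For intuition, here is a minimal sketch of what a rank-reduction intervention with these settings could look like in PyTorch. This is not the exact script used: the mapping from `rate` to the retained rank and the `model.model.layers[...]` attribute path are assumptions, so check the LASER code for the exact conventions.

```python
import torch

def rank_reduce(weight: torch.Tensor, rate: float) -> torch.Tensor:
    """Replace `weight` with a truncated-SVD low-rank approximation.

    NOTE: the mapping from `rate` to the retained rank below
    (keep ~max_rank * (1 - rate / 10), so rate=8 keeps ~20%) is an
    assumption for illustration; the LASER code may use a different rule.
    """
    U, S, Vh = torch.linalg.svd(weight.float(), full_matrices=False)
    k = max(1, int(S.shape[0] * (1 - rate / 10)))
    approx = (U[:, :k] * S[:k]) @ Vh[:k, :]  # rank-k reconstruction
    return approx.to(weight.dtype)

# Apply to the three MLP projections of layer 51 (lnum=51, lnames=mlp).
# `model` is assumed to be an already-loaded Llama-style causal LM.
mlp = model.model.layers[51].mlp
for name in ("gate_proj", "up_proj", "down_proj"):
    lin = getattr(mlp, name)
    lin.weight.data = rank_reduce(lin.weight.data, rate=8)
```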
Results on bigbench (causal_judgement):

| Name | Validation acc (higher is better) | Validation logloss (lower is better) | Test acc (higher is better) | Test logloss (lower is better) |
|---|---|---|---|---|
| Pallas-0.5 | 55.263 | 1.650 | 60.526 | 1.463 |
| Pallas-0.5-LASER-0.1 | 55.263 | 1.639 | 61.184 | 1.451 |
| Pallas-0.5-LASER-0.2 | 55.263 | 1.646 | 61.184 | 1.458 |
| Pallas-0.5-LASER-0.3 | 55.263 | 1.575 | 61.842 | 1.382 |
| Pallas-0.5-LASER-0.4 | 55.263 | 1.525 | 61.842 | 1.326 |
| Pallas-0.5-LASER-0.5 | 55.263 | 1.484 | 61.842 | 1.297 |
| Pallas-0.5-LASER-0.6 | 55.263 | 1.455 | 61.184 | 1.283 |
To replicate this on a single A100, you can use my branch (the original code throws OOM errors for 34B models).
Prompt format:
```
SYSTEM: <ANY SYSTEM CONTEXT>
USER:
ASSISTANT:
```
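A minimal usage sketch with transformers, following the plain-text template above (the exact newline placement in the prompt is my assumption):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Mihaiii/Pallas-0.5-LASER-0.6"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Fill the SYSTEM/USER slots of the template; whitespace is assumed.
prompt = (
    "SYSTEM: You are a helpful assistant.\n"
    "USER: Briefly, what is a LASER rank-reduction intervention?\n"
    "ASSISTANT:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```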