This is a fine-tuned version of Llama-3.2-3B-Instruct-abliterated.
If the desired result is not achieved, you can clear the conversation and try again.
The following scores have been re-evaluated; each value is the average for that test.
Benchmark | Llama-3.2-3B-Instruct | Llama-3.2-3B-Instruct-abliterated | Llama-3.2-3B-Instruct-abliterated-finetuned |
---|---|---|---|
IF_Eval | 76.55 | 76.76 | 77.80 |
MMLU Pro | 27.88 | 28.00 | 27.63 |
TruthfulQA | 50.55 | 50.73 | 49.83 |
BBH | 41.81 | 41.86 | 42.20 |
GPQA | 28.39 | 28.41 | 28.74 |
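To make the comparison in the table easier to read, here is a minimal Python sketch that computes how far the fine-tuned model moves from the original Llama-3.2-3B-Instruct on each benchmark, plus an unweighted mean per model. The scores are copied from the table; the names `SCORES`, `deltas_vs_base`, and `column_means` are hypothetical helpers introduced only for this illustration, and the unweighted mean across benchmarks is an assumption of the sketch, not a metric reported by the authors.

```python
# Scores copied from the table above, in column order:
# (Llama-3.2-3B-Instruct, ...-abliterated, ...-abliterated-finetuned).
SCORES = {
    "IF_Eval": (76.55, 76.76, 77.80),
    "MMLU Pro": (27.88, 28.00, 27.63),
    "TruthfulQA": (50.55, 50.73, 49.83),
    "BBH": (41.81, 41.86, 42.20),
    "GPQA": (28.39, 28.41, 28.74),
}


def deltas_vs_base(scores):
    """Return {benchmark: fine-tuned score minus base score}, rounded to 2 decimals."""
    return {name: round(vals[2] - vals[0], 2) for name, vals in scores.items()}


def column_means(scores):
    """Return the unweighted mean of each model's column, rounded to 2 decimals.

    Assumption: every benchmark counts equally; this is illustrative only.
    """
    columns = zip(*scores.values())
    return tuple(round(sum(col) / len(scores), 2) for col in columns)


if __name__ == "__main__":
    for name, delta in deltas_vs_base(SCORES).items():
        print(f"{name:<12} {delta:+.2f}")
    print("means (base, abliterated, fine-tuned):", column_means(SCORES))
```

Read this way, the fine-tuned model scores higher than the base model on IF_Eval, BBH, and GPQA, and lower on MMLU Pro and TruthfulQA.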
The script used for evaluation can be found in this repository at /eval.sh.
Base model
meta-llama/Llama-3.2-3B-Instruct