QuietImpostor committed

Commit ded9715
Parent(s): 7560135

Update README.md

Adding a bit of extra info for clarification on what the attached dataset is for.

README.md CHANGED
@@ -16,6 +16,8 @@ language:
This is a pruned version of [Llama 3 Refueled](https://www.huggingface.co/refuelai/llama-3-refueled), made using [mergekit](https://github.com/cg123/mergekit) and [PruneMe](https://www.github.com/arcee-ai/PruneMe).
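For context, this kind of layer pruning is typically expressed as a mergekit passthrough merge that keeps only selected layer ranges of the source model. Below is a minimal sketch of driving that from Python; the layer ranges, file names, and output path are illustrative assumptions, not the exact ones used for this prune.

```python
# Sketch only: write a mergekit "passthrough" config that keeps two layer
# ranges of the source model (placeholder ranges, not the real ones), then
# run mergekit's CLI on it.
import subprocess
import yaml

config = {
    "slices": [
        {"sources": [{"model": "refuelai/Llama-3-Refueled", "layer_range": [0, 20]}]},
        {"sources": [{"model": "refuelai/Llama-3-Refueled", "layer_range": [28, 32]}]},
    ],
    "merge_method": "passthrough",
    "dtype": "bfloat16",
}

with open("prune.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# mergekit-yaml <config> <output dir>; writes the pruned model to ./pruned-model
subprocess.run(["mergekit-yaml", "prune.yaml", "./pruned-model"], check=True)
```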
The model is semi-tested, but it still needs some debugging, most notably around converting it to GGUF; I am working on that.
Note: the [dataset](https://www.huggingface.co/yahma/alpaca-cleaned) was used only to evaluate which layers should be pruned. This model has **NOT** (yet) been finetuned.
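To make the dataset's role concrete: tools like PruneMe score each contiguous block of decoder layers by how little it changes the hidden states on calibration text, and the lowest-impact block becomes the pruning candidate. The sketch below illustrates that idea with plain transformers; it is not PruneMe's actual code, and the block size, sample count, and sequence length are arbitrary assumptions.

```python
# Illustrative sketch (not PruneMe's code): find the most redundant block of
# BLOCK consecutive layers by comparing hidden states before and after the
# block, averaged over a few calibration prompts.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "refuelai/Llama-3-Refueled"  # the base model linked above
BLOCK = 8                               # hypothetical number of layers to drop

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

texts = load_dataset("yahma/alpaca-cleaned", split="train[:32]")["instruction"]

num_layers = model.config.num_hidden_layers
scores = torch.zeros(num_layers - BLOCK)

with torch.no_grad():
    for text in texts:
        inputs = tok(text, return_tensors="pt", truncation=True, max_length=256).to(model.device)
        # hidden[i] is the representation entering layer i; hidden[i + BLOCK] is the
        # representation after BLOCK more layers. High cosine similarity between the
        # two (at the last token) means the block barely changes the representation.
        hidden = model(**inputs, output_hidden_states=True).hidden_states
        for start in range(num_layers - BLOCK):
            a, b = hidden[start][0, -1], hidden[start + BLOCK][0, -1]
            scores[start] += torch.nn.functional.cosine_similarity(a, b, dim=0).item()

best = int(scores.argmax())
print(f"Most redundant block under this toy metric: layers {best}..{best + BLOCK - 1}")
```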
### Performance
After only one test, because of limited compute and stupidly long inference times on my 3060 Ti (8 GB), it does show some interesting results.
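For reference, a single-prompt test like this can be reproduced by applying Meta's Llama 3 chat format through the tokenizer's chat template. A minimal sketch follows; the checkpoint path and generation settings are placeholder assumptions rather than the exact ones used for the result quoted below.

```python
# Sketch of running the "Hi!" test; the chat template applies Meta's Llama 3
# prompt format. Checkpoint path and generation settings are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "./pruned-model"  # wherever the pruned checkpoint lives (assumption)

tok = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_DIR, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Hi!"}]
input_ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tok.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```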
Here's the response after being prompted "Hi!" using the [example from Meta](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3).