Pruning Details
This is a prune of Llama 3 Refueled made with mergekit and PruneMe. The model is semi-tested but still needs some debugging, namely with converting to GGUF, which I am working on.
Note: the dataset was used to evaluate which layers should be pruned. This model was NOT finetuned.
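As a rough illustration of how a dataset can be used to decide which layers to prune: a common approach is to run dataset samples through the model, compare the hidden state entering a block of layers with the hidden state leaving it, and prune the block that changes the representation least. The sketch below assumes an angular-distance score and uses toy vectors; the function names and the scoring choice are illustrative assumptions, not a description of the exact procedure used for this model.

```python
import math

def angular_distance(a, b):
    """Angular distance in [0, 1] between two vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos) / math.pi

def most_redundant_block(hidden_states, block_size):
    """Find the block of `block_size` consecutive layers that changes
    the representation least.

    hidden_states[i] is assumed to be the (averaged) input to layer i,
    so a model with N layers provides N + 1 entries.
    Returns (start_layer, distance).
    """
    best_start, best_dist = None, float("inf")
    for start in range(len(hidden_states) - block_size):
        dist = angular_distance(hidden_states[start],
                                hidden_states[start + block_size])
        if dist < best_dist:
            best_start, best_dist = start, dist
    return best_start, best_dist

# Toy example: 3 "layers", 2-dimensional hidden states (hypothetical values).
toy_states = [[1.0, 0.0], [0.0, 1.0], [0.1, 1.0], [1.0, 1.0]]
start, dist = most_redundant_block(toy_states, block_size=1)
print(f"prune layer {start} (angular distance {dist:.3f})")
```

In practice the hidden states would come from the real model evaluated on the dataset, and the block size would be the number of layers to drop.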
Performance
I have only run one test so far, because of limited compute and very long inference times on my 3060 Ti (8GB), but it does show some interesting results. Here's the response after being prompted with "Hi!" using the example from Meta.
vel tips and recommendations.user
Hi!assistant
Hi! I can help you find the best travel tips and recommendations for your next trip. Where you most interested to travel and what kind of activities you most to to the 9e sure, we can start and letiing 10e 11e 12e 13e 14e 15e 16e 17e 18e 19e 20e 21e 23e 24e 5e 6e 7e 8e 9e 10e 11e 12e 13e 14e 15e
Even without finetuning, the model still shows some degree of instruction following. Finetuning was planned, but it is no longer in progress due to issues with unsloth. However, I am working on a project that will hopefully make pruning models easier.
Configuration
The following YAML configuration was used to produce this model:
slices:
  - sources:
      - model: refuelai/Llama-3-Refueled
        layer_range: [0, 19]
  - sources:
      - model: refuelai/Llama-3-Refueled
        layer_range: [29, 32]
merge_method: passthrough
dtype: bfloat16
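To make the effect of the two slices concrete, here is a small sketch that lists which layers survive. It assumes that each layer_range is half-open (start included, end excluded) and that the base model has 32 decoder layers, which is consistent with the upper bound of 32 in the config; both are assumptions rather than something stated above.

```python
def kept_layers(slices):
    """Expand half-open [start, end) ranges into a flat list of layer indices."""
    kept = []
    for start, end in slices:
        kept.extend(range(start, end))
    return kept

ORIGINAL_LAYERS = 32             # assumption: layer count of the base model
slices = [(0, 19), (29, 32)]     # the two layer_range entries from the config

kept = kept_layers(slices)
kept_set = set(kept)
removed = [i for i in range(ORIGINAL_LAYERS) if i not in kept_set]

print(f"{len(kept)} layers kept, {len(removed)} layers removed")
print("removed layers:", removed)
```

Under these assumptions, the passthrough merge stacks layers 0 to 18 directly onto layers 29 to 31, dropping the ten layers in between.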