--- license: apache-2.0 datasets: - arcee-ai/EvolKit-20k base_model: - Qwen/Qwen2.5-1.5B --- [![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory) # QuantFactory/EVA-D-Qwen2.5-1.5B-v0.0-GGUF This is quantized version of [EVA-UNIT-01/EVA-D-Qwen2.5-1.5B-v0.0](https://huggingface.co/EVA-UNIT-01/EVA-D-Qwen2.5-1.5B-v0.0) created using llama.cpp # Original Model Card # EVA-D Qwen2.5-1.5B v0.0
An experimental online logit distillation of EVA-Qwen2.5-14B-v0.1 into Qwen2.5-1.5B. Should work as a RP/storywriting specialist, but don't expect superb performance from it, due to it's small size. All in all, it was a fun experiment to do.
Note: using quantized KV cache with Qwen2.5 is not recommended and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough, so using f16 for it shouldn't be problematic.
Prompt format is ChatML.
Model was trained by Kearm and Auri.