Would love a 70B version
I mostly ran this as fp16 since it fit on my GPU and I found it to be better than just about all other 8B models, only to be outdone by models in the 20B neighborhood.
It handles tool calling much better than L3.1, and advanced system prompts like using 'thinking', 'reflection', and 'output' tags from the Reflection-Llama-3.1- 70B suggested system prompt.
https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B
I can run Llama-3.1-70B IQ4 (40Gb) split across my CPU and GPU and get 1.4 tokens/sec which is slow but worth it when I need it, and I'd love to compare it to a 70B Supernova model. Maybe think about incorporating the tag system from Reflection during fine tuning as well.
Great work, looking forward to the next release.
Hi,
Thank you for the nice feedback. The 70B version is SuperNova, which is a commercial model: https://blog.arcee.ai/meet-arcee-supernova-our-flagship-70b-model-alternative-to-openai/
I would recommend Nova, our previous 72B model (https://huggingface.co/arcee-ai/Arcee-Nova), which is one of the best 70B+ models available today.