Introducing RefuelLLM-2 and RefuelLLM-2-small, the next version of our large language models, purpose-built for data labeling, enrichment, and cleaning.
RefuelLLM-2 (83.82%) outperforms all state-of-the-art LLMs, including GPT-4-Turbo (80.88%), Claude-3-Opus (79.19%) and Gemini-1.5-Pro (74.59%), across a benchmark of ~30 data labeling tasks.
RefuelLLM-2-small (79.67%), aka Llama-3-Refueled, outperforms all comparable LLMs, including Claude-3-Sonnet (70.99%), Claude-3-Haiku (69.23%) and GPT-3.5-Turbo (68.13%).
Open sourcing the model weights: refuelai/Llama-3-Refueled
Detailed blog post: https://www.refuel.ai/blog-posts/announcing-refuel-llm-2
🧪 Try out the model here: https://labs.refuel.ai/playground