π¨ How green is your model? π± Introducing a new feature in the Comparator tool: Environmental Impact for responsible #LLM research! π open-llm-leaderboard/comparator Now, you can not only compare models by performance, but also by their environmental footprint!
π The Comparator calculates COβ emissions during evaluation and shows key model characteristics: evaluation score, number of parameters, architecture, precision, type... π οΈ Make informed decisions about your model's impact on the planet and join the movement towards greener AI!
π New feature of the Comparator of the π€ Open LLM Leaderboard: now compare models with their base versions & derivatives (finetunes, adapters, etc.). Perfect for tracking how adjustments affect performance & seeing innovations in action. Dive deeper into the leaderboard!
π οΈ Here's how to use it: 1. Select your model from the leaderboard. 2. Load its model tree. 3. Choose any base & derived models (adapters, finetunes, merges, quantizations) for comparison. 4. Press Load. See side-by-side performance metrics instantly!
Ready to dive in? π Try the π€ Open LLM Leaderboard Comparator now! See how models stack up against their base versions and derivatives to understand fine-tuning and other adjustments. Easier model analysis for better insights! Check it out here: open-llm-leaderboard/comparator π
Dive into multi-model evaluations, pinpoint the best model for your needs, and explore insights across top open LLMs all in one place. Ready to level up your model comparison game?
π¨ Instruct-tuning impacts models differently across families! Qwen2.5-72B-Instruct excels on IFEval but struggles with MATH-Hard, while Llama-3.1-70B-Instruct avoids MATH performance loss! Why? Can they follow the format in examples? π Compare models: open-llm-leaderboard/comparator