Uploaded Model
Developed by: Daemontatox
License: Apache 2.0
Finetuned from model: tiiuae/Falcon3-10B-Instruct
This model was fine-tuned from the Falcon-10B-Instruct model. It was trained 2x faster with Unsloth and Hugging Face's TRL library.
This model is intended for text generation tasks, with a focus on reasoning capabilities and instruction following, similar to capabilities demonstrated by the ChatGPT-O1-Mini model.
Training Details
This model was fine-tuned with Unsloth and TRL, resulting in significant speed improvements during the training process. Details on specific fine-tuning data, parameters and methods will be added soon. The fine-tuning process has prioritized improving the model's reasoning abilities on various benchmarks.
Intended Use
This model is intended for research and development purposes related to text generation, instruction following, and complex reasoning tasks. It is suitable for applications that require a model capable of handling multi-step logical problems and understanding nuanced instructions.
Focus on Reasoning: The fine-tuning has been geared towards enhancing the model's ability to tackle reasoning challenges and logic-based tasks.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here! Summarized results can be found here!
Metric | % Value |
---|---|
Avg. | 29.02 |
IFEval (0-Shot) | 55.92 |
BBH (3-Shot) | 43.07 |
MATH Lvl 5 (4-Shot) | 20.09 |
GPQA (0-shot) | 10.85 |
MuSR (0-shot) | 7.51 |
MMLU-PRO (5-shot) | 36.67 |
- Downloads last month
- 78
Model tree for Daemontatox/RA_Reasoner
Base model
tiiuae/Falcon3-10B-BaseCollection including Daemontatox/RA_Reasoner
Evaluation results
- strict accuracy on IFEval (0-Shot)Open LLM Leaderboard55.920
- normalized accuracy on BBH (3-Shot)Open LLM Leaderboard43.070
- exact match on MATH Lvl 5 (4-Shot)Open LLM Leaderboard20.090
- acc_norm on GPQA (0-shot)Open LLM Leaderboard10.850
- acc_norm on MuSR (0-shot)Open LLM Leaderboard7.510
- accuracy on MMLU-PRO (5-shot)test set Open LLM Leaderboard36.670