Daemontatox
/

RA_Reasoner

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Uploaded Model

Developed by: Daemontatox

License: Apache 2.0

Finetuned from model: tiiuae/Falcon3-10B-Instruct

This model was fine-tuned from the Falcon-10B-Instruct model. It was trained 2x faster with Unsloth and Hugging Face's TRL library.

This model is intended for text generation tasks, with a focus on reasoning capabilities and instruction following, similar to capabilities demonstrated by the ChatGPT-O1-Mini model.

Training Details

This model was fine-tuned with Unsloth and TRL, resulting in significant speed improvements during the training process. Details on specific fine-tuning data, parameters and methods will be added soon. The fine-tuning process has prioritized improving the model's reasoning abilities on various benchmarks.

Intended Use

This model is intended for research and development purposes related to text generation, instruction following, and complex reasoning tasks. It is suitable for applications that require a model capable of handling multi-step logical problems and understanding nuanced instructions.

Focus on Reasoning: The fine-tuning has been geared towards enhancing the model's ability to tackle reasoning challenges and logic-based tasks.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here! Summarized results can be found here!

Metric	% Value
Avg.	29.02
IFEval (0-Shot)	55.92
BBH (3-Shot)	43.07
MATH Lvl 5 (4-Shot)	20.09
GPQA (0-shot)	10.85
MuSR (0-shot)	7.51
MMLU-PRO (5-shot)	36.67

Downloads last month: 78

Safetensors

Model size

10.3B params

Tensor type

FP16

·

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for Daemontatox/RA_Reasoner

Base model

tiiuae/Falcon3-10B-Base

Finetuned

tiiuae/Falcon3-10B-Instruct

Finetuned

(10)

this model

Finetunes

1 model

Merges

Quantizations

Collection including Daemontatox/RA_Reasoner

Reason/COT

12 items • Updated about 13 hours ago • 3

Evaluation results

strict accuracy on IFEval (0-Shot)
Open LLM Leaderboard

55.920
normalized accuracy on BBH (3-Shot)
Open LLM Leaderboard

43.070
exact match on MATH Lvl 5 (4-Shot)
Open LLM Leaderboard

20.090
acc_norm on GPQA (0-shot)
Open LLM Leaderboard

10.850
acc_norm on MuSR (0-shot)
Open LLM Leaderboard

7.510
accuracy on MMLU-PRO (5-shot)
test set Open LLM Leaderboard

36.670

View on Papers With Code