RA_REASONER

RA_Reasoner 2.0

Model Details

Developed by: Daemontatox
License: Apache 2.0
Base Model: Daemontatox/RA_Reasoner

This model is fine-tuned from the Falcon-10B-Instruct model, leveraging advanced training optimizations to enhance reasoning and instruction-following capabilities. It was trained 2x faster using Unsloth and Hugging Face's TRL library.


Training Details

  • Frameworks Used: Unsloth, Hugging Face TRL
  • Fine-Tuning Focus: Emphasis on reasoning, logic-based tasks, and instruction comprehension.
  • Dataset: Includes examples from Daemontatox/Deepthinking-COT.
  • Optimization: Significant speedup during fine-tuning while maintaining model quality.

Further details on hyperparameters and fine-tuning methodology will be added in future updates.


Intended Use

This model is intended for research and development in text generation, reasoning tasks, and instruction-following applications.

Key Features:

  • Enhanced reasoning capabilities for multi-step logical problems.
  • Robust instruction-following for complex tasks.
  • Fine-tuned for Chain-of-Thought (COT) reasoning and inference.

Applications:

  • Research on reasoning-based AI systems.
  • Tasks requiring logical deductions, such as question answering and problem-solving.
  • General text generation with a focus on nuanced understanding.

Limitations and Warnings

  • This model is not designed for real-time or production-critical tasks.
  • Outputs may vary based on input specificity and complexity.
  • Users are responsible for ensuring ethical use and compliance with applicable regulations.

Acknowledgments

---# Open LLM Leaderboard Evaluation Results Detailed results can be found here! Summarized results can be found here!

Metric Value (%)
Average 29.00
IFEval (0-Shot) 53.66
BBH (3-Shot) 43.07
MATH Lvl 5 (4-Shot) 22.89
GPQA (0-shot) 9.96
MuSR (0-shot) 7.18
MMLU-PRO (5-shot) 37.26
Downloads last month
69
Safetensors
Model size
10.3B params
Tensor type
FP16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for Daemontatox/RA_Reasoner2.0

Finetuned
(1)
this model
Quantizations
4 models

Dataset used to train Daemontatox/RA_Reasoner2.0

Collection including Daemontatox/RA_Reasoner2.0

Evaluation results