base_model:
- Qwen/QwQ-32B-Preview
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- COT
- Reasoning
- Smart
- Qwen
- QwQ
license: apache-2.0
language:
- en
datasets:
- Daemontatox/LongCOT-Reason
metrics:
- accuracy
- character
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: PathfinderAI
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 37.45
      name: strict accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 52.65
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 47.58
      name: exact match
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 19.24
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.83
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 51.04
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
PathfinderAI
- Developed by: Daemontatox
- License: Apache 2.0
- Finetuned Using: Unsloth, Hugging Face Transformers, and TRL Library
Model Overview
The PathfinderAI Model is an advanced AI system optimized for logical reasoning, multi-step problem-solving, and decision-making tasks. Designed with efficiency and accuracy in mind, it employs a structured system prompt to ensure high-quality answers through a transparent and iterative thought process.
System Prompt and Workflow
This model operates using an innovative reasoning framework structured around the following steps:
Initial Thought: The model uses <Thinking> tags to reason step-by-step and craft its best possible response.
Self-Critique: It evaluates its initial response within <Critique> tags, focusing on:
- Accuracy: Is it factually correct and verifiable?
- Clarity: Is it clear and free of ambiguity?
- Completeness: Does it fully address the request?
- Improvement: What can be enhanced?
Revision: Based on the critique, the model refines its response within <Revising> tags.
Final Response: The revised response is presented clearly within <Final> tags.
Tag Innovation: When needed, the model creates and defines new tags for better structuring or clarity, ensuring consistent usage.
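Because the workflow wraps each stage in its own tag, downstream code can split a response into stages. The following is a minimal sketch of that idea, not part of the model's tooling: the function name `extract_stages` and the sample string are hypothetical, and only the four tag names listed in this section are handled (any new tags the model invents are ignored). It assumes a tag that is never closed runs until the next opening stage tag or the end of the text.

```python
import re

# Tag names taken from the workflow steps described in this section.
STAGES = ("Thinking", "Critique", "Revising", "Final")


def extract_stages(text: str) -> dict[str, str]:
    """Return the content of each workflow tag found in `text`.

    A tag that is opened but never closed is read up to the next
    opening stage tag or the end of the text (an assumption of this sketch).
    """
    any_open = "|".join(f"<{name}>" for name in STAGES)
    sections = {}
    for name in STAGES:
        pattern = rf"<{name}>(.*?)(?:</{name}>|(?={any_open})|\Z)"
        match = re.search(pattern, text, flags=re.DOTALL)
        if match:
            sections[name] = match.group(1).strip()
    return sections


# Hypothetical model output; note the unclosed <Critique> tag.
sample = "<Thinking>2 + 2 = 4</Thinking><Critique>Looks right.<Final>4</Final>"
print(extract_stages(sample))
```

Showing only the `Final` entry to end users while logging the other stages is one way to use the result.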
Key Features
- Structured Reasoning: Transparent, multi-step approach for generating and refining answers.
- Self-Improvement: Built-in critique and revision ensure continuous response enhancement.
- Clarity and Adaptability: Tagging system provides organized, adaptable responses tailored to user needs.
- Creative Flexibility: Supports dynamic problem-solving with the ability to introduce new tags and concepts.
Use Cases
The model is designed for various domains, including:
- Research and Analysis: Extracting insights and providing structured explanations.
- Education: Assisting with tutoring by breaking down complex problems step-by-step.
- Problem-Solving: Offering logical and actionable solutions for multi-step challenges.
- Content Generation: Producing clear, well-organized creative or professional content.
Training Details
Frameworks:
- Unsloth for accelerated training.
- Hugging Face Transformers and the TRL library for reinforcement learning from human feedback (RLHF).
Dataset: Finetuned on diverse reasoning-focused tasks, including logical puzzles, mathematical problems, and commonsense reasoning scenarios.
Hardware Efficiency:
- Trained with bnb-4bit precision for reduced memory usage.
- Optimized training pipeline achieving 2x faster development cycles.
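To give a rough sense of why 4-bit precision reduces memory usage, the sketch below estimates the memory needed for the weights alone. The parameter count of about 32 billion is an assumption inferred from the "32B" in the base model name, and the helper `weight_memory_gib` is hypothetical. The estimate ignores quantization constants, activations, optimizer state, and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: ~32e9 parameters, inferred from "32B" in the base model name.
NUM_PARAMS = 32e9

for label, bits in [("16-bit", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(NUM_PARAMS, bits):.0f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the 16-bit footprint, roughly 15 GiB instead of 60 GiB.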
Limitations
- Hallucinations: The model might hallucinate on problems with very long contexts.
- Unclosed tags: As the model gets deep into thinking and reflecting, it tends to leave thinking or critique tags unclosed.
- Tag compression: As the model grows confident in its answer, it uses fewer and fewer tags and might put everything in a single tag instead of reasoning step by step.
- High resource use: This model is resource-intensive and needs a lot of uninterrupted compute, since it continuously generates tokens to reason, so it might not work well on consumer hardware.
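The unclosed-tag and tag-compression issues can be detected after generation. The sketch below is one possible check, assuming the four stage tags named earlier in this card; the function name `check_tags` is hypothetical and the logic only counts opening and closing tags, so it will not catch every malformed output.

```python
# Stage tags named in the "System Prompt and Workflow" section.
STAGES = ("Thinking", "Critique", "Revising", "Final")


def check_tags(text: str) -> dict[str, list[str]]:
    """Report stage tags that are unclosed or missing in `text`."""
    unclosed, missing = [], []
    for name in STAGES:
        opened = text.count(f"<{name}>")
        closed = text.count(f"</{name}>")
        if opened == 0:
            missing.append(name)      # stage skipped (possible tag compression)
        elif opened > closed:
            unclosed.append(name)     # stage started but never closed
    return {"unclosed": unclosed, "missing": missing}


# Hypothetical output: <Thinking> is never closed, two stages are skipped.
print(check_tags("<Thinking>step one<Final>42</Final>"))
```

A caller could retry generation or fall back to the raw text when either list is non-empty.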
Ethical Considerations
- Transparency: Responses are structured for verifiability through tagging.
- Bias Mitigation: Includes self-critique to minimize biases and ensure fairness.
- Safe Deployment: Users are encouraged to evaluate outputs to prevent harm or misinformation.
License
This model is distributed under the Apache 2.0 license, allowing users to use, modify, and share it in compliance with the license terms.
Acknowledgments
Special thanks to:
- Unsloth for accelerated training workflows.
- Hugging Face for their powerful tools and libraries.
Experience PathfinderAI and leverage its structured reasoning and self-improvement capabilities for any task requiring advanced AI reasoning.
Open LLM Leaderboard Evaluation Results
Detailed and summarized results are available on the Open LLM Leaderboard.
| Metric | Value (%) |
|---|---|
| Avg. | 38.13 |
| IFEval (0-Shot) | 37.45 |
| BBH (3-Shot) | 52.65 |
| MATH Lvl 5 (4-Shot) | 47.58 |
| GPQA (0-shot) | 19.24 |
| MuSR (0-shot) | 20.83 |
| MMLU-PRO (5-shot) | 51.04 |
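The "Avg." row appears to be the unweighted mean of the six benchmark scores. A quick check, using only the values from the table:

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 37.45,
    "BBH (3-Shot)": 52.65,
    "MATH Lvl 5 (4-Shot)": 47.58,
    "GPQA (0-shot)": 19.24,
    "MuSR (0-shot)": 20.83,
    "MMLU-PRO (5-shot)": 51.04,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 38.13"
```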