---
base_model:
- Qwen/QwQ-32B-Preview
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- COT
- Reasoning
- Smart
- Qwen
- QwQ
license: apache-2.0
language:
- en
datasets:
- Daemontatox/LongCOT-Reason
metrics:
- accuracy
- character
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: PathfinderAI
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 37.45
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 52.65
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 47.58
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 19.24
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.83
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 51.04
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
---

![image](./image.webp)

# PathfinderAI   

- **Developed by:** Daemontatox  
- **License:** Apache 2.0   
- **Finetuned Using:** [Unsloth](https://github.com/unslothai/unsloth), Hugging Face Transformers, and TRL Library  

## Model Overview  

The **PathfinderAI** model is an advanced AI system optimized for logical reasoning, multi-step problem-solving, and decision-making tasks. Designed with efficiency and accuracy in mind, it employs a structured system prompt to ensure high-quality answers through a transparent and iterative thought process.  
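
As a quick-start sketch (not an official snippet from this card), the model should load with the standard `transformers` text-generation API. The repository id `Daemontatox/PathfinderAI` is taken from the leaderboard links below, the generation settings are illustrative, and a chat template inherited from the Qwen/QwQ base is assumed.

```python
# Illustrative usage sketch (assumed repo id: Daemontatox/PathfinderAI).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Daemontatox/PathfinderAI"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU; use float16 otherwise
    device_map="auto",
)

messages = [
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The tag-based workflow emits many reasoning tokens before the <Final> answer,
# so a generous max_new_tokens budget is advisable.
output_ids = model.generate(input_ids, max_new_tokens=2048, temperature=0.7, do_sample=True)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```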

### System Prompt and Workflow  
 
This model operates using an innovative reasoning framework structured around the following steps (an illustrative exchange using these tags is sketched after the list):  

1. **Initial Thought:**  
   The model uses `<Thinking>` tags to reason step by step and craft its best possible response.  

2. **Self-Critique:**  
   It evaluates its initial response within `<Critique>` tags, focusing on:  
   - **Accuracy:** Is it factually correct and verifiable?  
   - **Clarity:** Is it clear and free of ambiguity?  
   - **Completeness:** Does it fully address the request?  
   - **Improvement:** What can be enhanced?  

3. **Revision:**  
   Based on the critique, the model refines its response within `<Revising>` tags.  

4. **Final Response:**  
   The revised response is presented clearly within `<Final>` tags.  

5. **Tag Innovation:**  
   When needed, the model creates and defines new tags for better structuring or clarity, ensuring consistent usage.  
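
The exchange below is an illustrative sketch of the tag workflow above; the question, the tag contents, and the exact phrasing are hypothetical and will vary from prompt to prompt.

```
User: What is 17 × 24?

<Thinking>
17 × 24 = 17 × 20 + 17 × 4 = 340 + 68 = 408.
</Thinking>
<Critique>
Accuracy: the decomposition and the sum check out. Clarity: each step is explicit.
Completeness: the question is answered. Improvement: state the result up front.
</Critique>
<Revising>
Leading with the result: 17 × 24 = 408, computed as 340 + 68.
</Revising>
<Final>
17 × 24 = 408.
</Final>
```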

### Key Features  
- **Structured Reasoning:** Transparent, multi-step approach for generating and refining answers.  
- **Self-Improvement:** Built-in critique and revision ensure continuous response enhancement.  
- **Clarity and Adaptability:** Tagging system provides organized, adaptable responses tailored to user needs.  
- **Creative Flexibility:** Supports dynamic problem-solving with the ability to introduce new tags and concepts.  

---

## Use Cases  

The model is designed for various domains, including:  
1. **Research and Analysis:** Extracting insights and providing structured explanations.  
2. **Education:** Assisting with tutoring by breaking down complex problems step-by-step.  
3. **Problem-Solving:** Offering logical and actionable solutions for multi-step challenges.  
4. **Content Generation:** Producing clear, well-organized creative or professional content.  

---

## Training Details  

- **Frameworks:**  
  - [Unsloth](https://github.com/unslothai/unsloth) for accelerated training.  
  - Hugging Face Transformers and the TRL library for reinforcement learning from human feedback (RLHF).  

- **Dataset:** Finetuned on diverse reasoning-focused tasks, including logical puzzles, mathematical problems, and commonsense reasoning scenarios.  

- **Hardware Efficiency:**  
  - Trained with bnb-4bit precision for reduced memory usage (a 4-bit loading sketch for inference follows this list).  
  - Optimized training pipeline achieving 2x faster development cycles.  
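
For memory-constrained inference, a 4-bit load via `bitsandbytes` mirrors the bnb-4bit setup mentioned above. This is a sketch assuming a recent `transformers` + `bitsandbytes` install; it is not a configuration shipped with the model, and the repo id is again taken from the leaderboard links.

```python
# Sketch: loading the model in 4-bit (NF4) via bitsandbytes to reduce memory use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Daemontatox/PathfinderAI"  # repo id assumed from the leaderboard links

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```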

---

## Limitations  

- **Hallucinations:** The model may hallucinate on very long-context problems.
- **Unclosed tags:** As the model gets deep into thinking and reflecting, it tends not to close `<Thinking>` or `<Critique>` tags; a simple post-processing pass (sketched after this list) can append the missing closing tags.
- **Tag compression:** As the model grows confident in its answer, it uses fewer and fewer tags and may place everything inside the `<Thinking>` tag instead of working through the full step-by-step workflow.
- **High resource use:** The model is resource-intensive and needs long stretches of uninterrupted compute, since it continuously generates tokens to reason, so it may not run well on consumer hardware.
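
A small post-processing pass can mitigate the unclosed-tag behaviour by appending any missing closing tags before the output is parsed. The helper below is a hypothetical sketch, not part of the model or its tooling; the tag names follow the workflow described earlier.

```python
# Hypothetical helper: append closing tags for any reasoning tags left open.
import re

REASONING_TAGS = ["Thinking", "Critique", "Revising", "Final"]

def close_unclosed_tags(text: str) -> str:
    for tag in REASONING_TAGS:
        opened = len(re.findall(f"<{tag}>", text))
        closed = len(re.findall(f"</{tag}>", text))
        # One closing tag per unmatched opening tag.
        text += f"\n</{tag}>" * max(opened - closed, 0)
    return text

print(close_unclosed_tags("<Thinking>2 + 2 = 4"))
# -> <Thinking>2 + 2 = 4
#    </Thinking>
```
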
---

## Ethical Considerations  

- **Transparency:** Responses are structured for verifiability through tagging.  
- **Bias Mitigation:** Includes self-critique to minimize biases and ensure fairness.  
- **Safe Deployment:** Users are encouraged to evaluate outputs to prevent harm or misinformation.  

---

## License  

This model is distributed under the Apache 2.0 license, allowing users to use, modify, and share it in compliance with the license terms.  

---

## Acknowledgments  

Special thanks to:  
- [Unsloth](https://github.com/unslothai/unsloth) for accelerated training workflows.  
- Hugging Face for their powerful tools and libraries.  

---

Experience **PathfinderAI**, leveraging its structured reasoning and self-improvement capabilities for any task requiring advanced AI reasoning.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__PathfinderAI-details)!
Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox/PathfinderAI)!

|      Metric       |% Value|
|-------------------|------:|
|Avg.               |  38.13|
|IFEval (0-Shot)    |  37.45|
|BBH (3-Shot)       |  52.65|
|MATH Lvl 5 (4-Shot)|  47.58|
|GPQA (0-shot)      |  19.24|
|MuSR (0-shot)      |  20.83|
|MMLU-PRO (5-shot)  |  51.04|