README.md · Daemontatox/PathfinderAI at main

File size: 7,958 Bytes

---
base_model:
- Qwen/QwQ-32B-Preview
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- COT
- Reasoning
- Smart
- Qwen
- QwQ
license: apache-2.0
language:
- en
datasets:
- Daemontatox/LongCOT-Reason
metrics:
- accuracy
- character
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: PathfinderAI
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 37.45
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 52.65
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 47.58
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 19.24
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.83
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 51.04
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/PathfinderAI
      name: Open LLM Leaderboard
---

![image](./image.webp)

# PathfinderAI   

- **Developed by:** Daemontatox  
- **License:** Apache 2.0   
- **Finetuned Using:** [Unsloth](https://github.com/unslothai/unsloth), Hugging Face Transformers, and TRL Library  

## Model Overview  

The **PathfinderAI  Model** is an advanced AI system optimized for logical reasoning, multi-step problem-solving, and decision-making tasks. Designed with efficiency and accuracy in mind, it employs a structured system prompt to ensure high-quality answers through a transparent and iterative thought process.  

### System Prompt and Workflow  
 
This model operates using an innovative reasoning framework structured around the following steps:  

1. **Initial Thought:**  
   The model uses `<Thinking>` tags to reason step-by-step and craft its best possible response.  
   Example:  

2. **Self-Critique:**  
It evaluates its initial response within `<Critique>` tags, focusing on:  
- **Accuracy:** Is it factually correct and verifiable?  
- **Clarity:** Is it clear and free of ambiguity?  
- **Completeness:** Does it fully address the request?  
- **Improvement:** What can be enhanced?  
Example:  

3. **Revision:**  
Based on the critique, the model refines its response within `<Revising>` tags.  
Example:  

4. **Final Response:**  
The revised response is presented clearly within `<Final>` tags.  
Example:  

5. **Tag Innovation:**  
When needed, the model creates and defines new tags for better structuring or clarity, ensuring consistent usage.  
Example:  

### Key Features  
- **Structured Reasoning:** Transparent, multi-step approach for generating and refining answers.  
- **Self-Improvement:** Built-in critique and revision ensure continuous response enhancement.  
- **Clarity and Adaptability:** Tagging system provides organized, adaptable responses tailored to user needs.  
- **Creative Flexibility:** Supports dynamic problem-solving with the ability to introduce new tags and concepts.  

---

## Use Cases  

The model is designed for various domains, including:  
1. **Research and Analysis:** Extracting insights and providing structured explanations.  
2. **Education:** Assisting with tutoring by breaking down complex problems step-by-step.  
3. **Problem-Solving:** Offering logical and actionable solutions for multi-step challenges.  
4. **Content Generation:** Producing clear, well-organized creative or professional content.  

---

## Training Details  

- **Frameworks:**  
- [Unsloth](https://github.com/unslothai/unsloth) for accelerated training.  
- Hugging Face Transformers and the TRL library for reinforcement learning with human feedback (RLHF).  

- **Dataset:** Finetuned on diverse reasoning-focused tasks, including logical puzzles, mathematical problems, and commonsense reasoning scenarios.  

- **Hardware Efficiency:**  
- Trained with bnb-4bit precision for reduced memory usage.  
- Optimized training pipeline achieving 2x faster development cycles.  

---

## Limitations  

- **Hallucinations** Model might hallucinate in very long context problems.
- **Unclosed tags** As the model gets deep into thinking and reflecting ,it has a tendency to not close thinking or critique tags .
- **Tags Compression** As the model gets confident in the answer , it will use less and less tags and might have everything in the <Thinking> Tag ,instead of reasoning and going step by step.
- **High Resource** This Model is Resource intensive and needs a lot of uninterrupted computing , since it's continuously generating tokens to reason , so it might work the best with consumer hardware.
---

## Ethical Considerations  

- **Transparency:** Responses are structured for verifiability through tagging.  
- **Bias Mitigation:** Includes self-critique to minimize biases and ensure fairness.  
- **Safe Deployment:** Users are encouraged to evaluate outputs to prevent harm or misinformation.  

---

## License  

This model is distributed under the Apache 2.0 license, allowing users to use, modify, and share it in compliance with the license terms.  

---

## Acknowledgments  

Special thanks to:  
- [Unsloth](https://github.com/unslothai/unsloth) for accelerated training workflows.  
- Hugging Face for their powerful tools and libraries.  

---

Experience the **PathfinderAI l**, leveraging its structured reasoning and self-improvement capabilities for any task requiring advanced AI reasoning.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__PathfinderAI-details)!
Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox/PathfinderAI)!

|      Metric       |% Value|
|-------------------|------:|
|Avg.               |  38.13|
|IFEval (0-Shot)    |  37.45|
|BBH (3-Shot)       |  52.65|
|MATH Lvl 5 (4-Shot)|  47.58|
|GPQA (0-shot)      |  19.24|
|MuSR (0-shot)      |  20.83|
|MMLU-PRO (5-shot)  |  51.04|