---
license: creativeml-openrail-m
datasets:
- microsoft/orca-math-word-problems-200k
language:
- en
base_model:
- allenai/Llama-3.1-Tulu-3-8B
pipeline_tag: text-generation
library_name: transformers
tags:
- safetensors
- math
- tulu
- trl
- llama
- text-generation-inference
- math_lingo
---
[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)
# QuantFactory/Tulu-MathLingo-8B-GGUF
This is a quantized (GGUF) version of [prithivMLmods/Tulu-MathLingo-8B](https://huggingface.co/prithivMLmods/Tulu-MathLingo-8B), created using llama.cpp.
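Since this repository ships GGUF files, they can be run directly with llama.cpp or its Python bindings. Below is a minimal sketch using `llama-cpp-python`; the `*Q4_K_M.gguf` filename pattern is an assumption — substitute whichever quantization level this repository actually provides.

```python
# Minimal sketch: running one of this repo's GGUF quants with llama-cpp-python.
# The "*Q4_K_M.gguf" pattern is an assumption — pick an available file.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantFactory/Tulu-MathLingo-8B-GGUF",
    filename="*Q4_K_M.gguf",  # glob over the repo's GGUF files
    n_ctx=4096,               # context window
)

out = llm(
    "If a train travels 60 miles in 2 hours, what is its average speed?",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```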
# Original Model Card
# Tulu-MathLingo-8B Model Files
The **Tulu-MathLingo-8B** model is a fine-tuned version of **allenai/Llama-3.1-Tulu-3-8B**, optimized for solving mathematical word problems and reasoning tasks in English and the Tulu language. It pairs general language understanding with reasoning capabilities focused on solving math-related queries.
| **File Name** | **Size** | **Description** | **Upload Status** |
|-----------------------------------|--------------|------------------------------------------------|-------------------|
| `.gitattributes` | 1.57 kB | Configures LFS tracking for large files. | Updated |
| `README.md` | 292 Bytes | Basic details about the uploaded model. | Updated |
| `config.json` | 988 Bytes | Contains model architecture and metadata. | Uploaded |
| `generation_config.json` | 241 Bytes | Parameters for text generation (e.g., length, temperature). | Uploaded |
| `model-00001-of-00004.safetensors`| 4.98 GB | Part 1 of model weights. | Uploaded (LFS) |
| `model-00002-of-00004.safetensors`| 5.00 GB      | Part 2 of model weights.                       | Uploaded (LFS)    |
| `model-00003-of-00004.safetensors`| 4.92 GB | Part 3 of model weights. | Uploaded (LFS) |
| `model-00004-of-00004.safetensors`| 1.17 GB | Part 4 of model weights. | Uploaded (LFS) |
| `model.safetensors.index.json` | 25.4 kB | Index file for multi-part model weights. | Uploaded |
| `special_tokens_map.json` | 462 Bytes | Maps special tokens (e.g., `<PAD>`, `<EOS>`). | Uploaded |
| `tokenizer.json` | 17.2 MB | Full tokenizer configuration. | Uploaded (LFS) |
| `tokenizer_config.json` | 57.6 kB | Metadata for tokenizer usage. | Uploaded |
### Sample Solve
![xvxv.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/vX8m-ltsacAztTF9SqDxB.png)
### **Key Features**
1. **Multilingual Math Reasoning:**
- Designed for solving complex math problems in **English** and **Tulu**.
2. **Text Generation:**
- Generates detailed and contextually accurate text responses.
3. **Fine-Tuned Specializations:**
- Trained on the **microsoft/orca-math-word-problems-200k** dataset for word problem-solving.
4. **Special Token Mapping:**
   - Maps special tokens such as `<PAD>` and `<EOS>` so that padding and end-of-sequence handling behave correctly (see the snippet after this list).
5. **Secure and Efficient Storage:**
- Model weights are stored in the **Safetensors** format for secure and faster inference.
6. **Large Parameter Size:**
- 8.03 billion parameters enable handling complex queries and multi-turn conversations.
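As a quick check of the special-token mapping described above, the tokenizer configuration can be inspected with the standard `transformers` API; a minimal sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Tulu-MathLingo-8B")

# Show which special tokens (pad, eos, bos, ...) the checkpoint defines.
print(tokenizer.special_tokens_map)
print("EOS token id:", tokenizer.eos_token_id)
```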
---
### **Training Details**
- **Base Model:** [allenai/Llama-3.1-Tulu-3-8B](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B)
- **Fine-Tuning:**
  - Performed in multiple stages: **SFT (Supervised Fine-Tuning)** followed by **DPO (Direct Preference Optimization)**.
- **Dataset:**
  - Trained on **200k word problems** from the **microsoft/orca-math-word-problems-200k** dataset (a loading sketch follows this list).
- **Model Size:**
- 8.03B parameters, optimized for **FP16** tensor type.
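The training dataset is public on the Hugging Face Hub, so it can be inspected directly with the `datasets` library; a minimal sketch (the `question`/`answer` field names follow the dataset card):

```python
from datasets import load_dataset

# Load the Orca math word-problem dataset used for fine-tuning.
ds = load_dataset("microsoft/orca-math-word-problems-200k", split="train")

print(len(ds))            # ~200k examples
print(ds[0]["question"])  # natural-language word problem
print(ds[0]["answer"])    # worked solution
```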
---
### **Applications**
1. **Mathematical Word Problems:**
- Solve structured or unstructured math problems in natural language.
2. **Conversational AI for Math:**
- Engage users in interactive dialogues focused on math and logic reasoning.
3. **Multilingual Support:**
- Supports queries in **Tulu** and **English**, enhancing accessibility.
4. **Education Tools:**
- Useful in tutoring systems for math, helping students with problem-solving.
---
### **Usage**
#### **Loading the Model**
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Tulu-MathLingo-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# "fp16" is not a valid torch_dtype; use torch.float16 instead.
# device_map="auto" places the weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
```
---
#### **Math Word Problem**
```python
query = "If a train travels 60 miles in 2 hours, what is its average speed?"
# Move inputs onto the same device as the model before generating.
inputs = tokenizer(query, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Answer:", response)
```
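Because Tulu-derived checkpoints are instruction-tuned, answers are usually better when the query is wrapped in the tokenizer's chat template. This is a minimal sketch, assuming the uploaded tokenizer ships a chat template:

```python
# Sketch assuming tokenizer.chat_template is defined for this checkpoint.
messages = [{"role": "user", "content": query}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```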
### **Performance Requirements**
- **Hardware:**
- Requires a GPU with at least **24GB VRAM** for optimal performance due to model size and FP16 usage.
- **Optimization:**
- Use mixed precision (`fp16`) for reduced memory footprint.
- Split inference across multiple GPUs if necessary (a sketch follows below).
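For the multi-GPU case, a minimal sketch using the `accelerate`-backed `device_map` (the `max_memory` caps are illustrative assumptions, not measured values):

```python
import torch
from transformers import AutoModelForCausalLM

# device_map="auto" shards the fp16 weights across visible GPUs;
# the max_memory caps below are illustrative — match them to your hardware.
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/Tulu-MathLingo-8B",
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "12GiB", 1: "12GiB"},
)
```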
---