metadata
license: creativeml-openrail-m
datasets:
  - microsoft/orca-math-word-problems-200k
language:
  - en
base_model:
  - allenai/Llama-3.1-Tulu-3-8B
pipeline_tag: text-generation
library_name: transformers
tags:
  - safetensors
  - math
  - tulu
  - trl
  - llama
  - text-generation-inference
  - math_lingo


QuantFactory/Tulu-MathLingo-8B-GGUF

This is a quantized version of prithivMLmods/Tulu-MathLingo-8B, created using llama.cpp.

Original Model Card

Tulu-MathLingo-8B Model Files

The Tulu-MathLingo-8B model is a fine-tuned version of allenai/Llama-3.1-Tulu-3-8B (itself built on meta-llama/Llama-3.1-8B), optimized for solving mathematical word problems and reasoning tasks in English and the Tulu language. It combines language understanding and reasoning capabilities with a focus on solving math-related queries.

| File Name | Size | Description | Upload Status |
|---|---|---|---|
| .gitattributes | 1.57 kB | Configures LFS tracking for large files. | Updated |
| README.md | 292 Bytes | Basic details about the uploaded model. | Updated |
| config.json | 988 Bytes | Contains model architecture and metadata. | Uploaded |
| generation_config.json | 241 Bytes | Parameters for text generation (e.g., length, temperature). | Uploaded |
| model-00001-of-00004.safetensors | 4.98 GB | Part 1 of model weights. | Uploaded (LFS) |
| model-00002-of-00004.safetensors | 5 GB | Part 2 of model weights. | Uploaded (LFS) |
| model-00003-of-00004.safetensors | 4.92 GB | Part 3 of model weights. | Uploaded (LFS) |
| model-00004-of-00004.safetensors | 1.17 GB | Part 4 of model weights. | Uploaded (LFS) |
| model.safetensors.index.json | 25.4 kB | Index file for multi-part model weights. | Uploaded |
| special_tokens_map.json | 462 Bytes | Maps special tokens (e.g., <PAD>, <EOS>). | Uploaded |
| tokenizer.json | 17.2 MB | Full tokenizer configuration. | Uploaded (LFS) |
| tokenizer_config.json | 57.6 kB | Metadata for tokenizer usage. | Uploaded |
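
As a quick sanity check on the table above, the four weight shards should add up to roughly the parameter count times the bytes per parameter. The sketch below assumes 16-bit weights (2 bytes each) and decimal gigabytes; both are assumptions, not statements from the model files themselves.

```python
# Shard sizes in GB, copied from the file table above.
shard_sizes_gb = [4.98, 5.0, 4.92, 1.17]

PARAMS = 8.03e9        # parameter count stated in this card
BYTES_PER_PARAM = 2    # assumption: 16-bit (FP16) weights

total_gb = sum(shard_sizes_gb)
expected_gb = PARAMS * BYTES_PER_PARAM / 1e9  # assumption: decimal GB

print(f"sum of shards:    {total_gb:.2f} GB")
print(f"params x 2 bytes: {expected_gb:.2f} GB")
```

The two figures agree to within a few hundredths of a gigabyte, which is consistent with the FP16 tensor type listed under Training Details.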

Sample Solve

[Image: xvxv.png]

Key Features

  1. Multilingual Math Reasoning:

    • Designed for solving complex math problems in English and Tulu.
  2. Text Generation:

    • Generates detailed and contextually accurate text responses.
  3. Fine-Tuned Specializations:

    • Trained on the microsoft/orca-math-word-problems-200k dataset for word problem-solving.
  4. Special Token Mapping:

    • Uses special tokens such as <PAD> and <EOS> for padding and end-of-sequence marking.
  5. Secure and Efficient Storage:

    • Model weights are stored in the Safetensors format for safe and fast loading.
  6. Large Parameter Size:

    • 8.03 billion parameters enable handling complex queries and multi-turn conversations.

Training Details

  • Base Model: allenai/Llama-3.1-Tulu-3-8B (built on meta-llama/Llama-3.1-8B)

  • Fine-Tuned:

    • Through multiple stages: SFT (Supervised Fine-Tuning) and DPO (Direct Preference Optimization).
  • Dataset:

    • Trained on 200k word problems from the Microsoft Orca Math Word Problems Dataset.
  • Model Size:

    • 8.03B parameters, stored as FP16 tensors.
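
For readers unfamiliar with the DPO stage mentioned above, the following is a minimal sketch of the standard per-example DPO loss. It is not the training code used for this model: the card gives no hyperparameters, so `beta=0.1` and the log-probability values in the usage lines are hypothetical, chosen only for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard per-example DPO loss.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy being trained and a frozen reference
    model. beta=0.1 is a hypothetical default, not a value from this card.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) = log(1 + exp(-x)), written in a numerically stable form
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))

# Hypothetical log-probabilities: the policy prefers the chosen answer
# more strongly than the reference does, so the loss falls below log(2).
baseline = dpo_loss(-11.0, -11.0, -11.0, -11.0)
improved = dpo_loss(-10.0, -12.0, -11.0, -11.0)
print(baseline, improved)
```

When the policy and the reference agree, the loss equals log 2; it decreases as the policy assigns relatively more probability to the preferred response.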

Applications

  1. Mathematical Word Problems:

    • Solve structured or unstructured math problems in natural language.
  2. Conversational AI for Math:

    • Engage users in interactive dialogues focused on mathematical and logical reasoning.
  3. Multilingual Support:

    • Supports queries in Tulu and English, enhancing accessibility.
  4. Education Tools:

    • Useful in tutoring systems for math, helping students with problem-solving.

Usage

Loading the Model

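import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Tulu-MathLingo-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# torch_dtype expects a torch dtype (or a name such as "float16"), not "fp16"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

Math Word Problem

query = "If a train travels 60 miles in 2 hours, what is its average speed?"
# Move the input tensors to the same device as the model
inputs = tokenizer(query, return_tensors="pt").to(model.device)
# max_new_tokens limits only the generated text; max_length would also count the prompt
outputs = model.generate(**inputs, max_new_tokens=100)

# The decoded text contains the prompt followed by the model's continuation
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Answer:", response)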

Performance Requirements

  • Hardware:

    • A GPU with at least 24 GB of VRAM is recommended, since the FP16 weights alone occupy about 16 GB.
  • Optimization:

    • Load the weights in half precision (FP16) to halve the memory footprint relative to FP32.
    • Split inference across multiple GPUs if necessary.
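
The memory guidance above can be approximated with a back-of-the-envelope estimate. The sketch below assumes the usual Llama-3.1-8B architecture values (32 layers, 8 key/value heads, head dimension 128), which are not stated in this card, and it ignores activations and framework overhead. The 4-bit line is an idealized figure; real GGUF quantization types use somewhat more than 4 bits per weight.

```python
# Assumed Llama-3.1-8B architecture values (not stated in this card).
NUM_LAYERS = 32
NUM_KV_HEADS = 8       # grouped-query attention
HEAD_DIM = 128
PARAMS = 8.03e9        # parameter count from this card

GIB = 2**30

def weight_gib(bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return PARAMS * bytes_per_param / GIB

def kv_cache_gib(context_tokens, bytes_per_value=2):
    """Memory for the key/value cache of one sequence, in GiB."""
    # Keys and values (factor 2) for every layer and every KV head.
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * bytes_per_value
    return context_tokens * per_token / GIB

fp16_total = weight_gib(2) + kv_cache_gib(8192)
ideal_4bit_weights = weight_gib(0.5)

print(f"FP16 weights:             {weight_gib(2):.2f} GiB")
print(f"KV cache, 8192 tokens:    {kv_cache_gib(8192):.2f} GiB")
print(f"FP16 total (approx.):     {fp16_total:.2f} GiB")
print(f"Idealized 4-bit weights:  {ideal_4bit_weights:.2f} GiB")
```

Under these assumptions the FP16 model with an 8192-token context needs about 16 GiB before overhead, which is why a 24 GB card leaves comfortable headroom while a 16 GB card does not.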