Model Card for Sparky-SQL-Llama-3.2-1B

Model Details

Model Description

The model was fine-tuned from the Llama-3.2-1B base model on SQL query data.

  • Developed by: Nishanth
  • Model type: Llama
  • Language(s) (NLP): English
  • License: Apache License 2.0
  • Finetuned from model: Llama-3.2-1B

How to Get Started with the Model

from transformers import pipeline

model_id = "itsme-nishanth/Sparky-SQL-Llama-3.2-1B"
# Load a text-generation pipeline on the GPU
pipe = pipeline("text-generation", model=model_id, device="cuda")
messages = [
    {"role": "user", "content": "list down the product names and its type provided by vendor 'vanhelsing' from 'products' table?"},
]
# For chat-style input, generated_text holds the whole conversation;
# the last entry is the assistant's response (a dict with "role" and "content")
print(pipe(messages, max_new_tokens=100)[0]["generated_text"][-1])
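
The getting-started snippet prints the whole assistant message (a dict with "role" and "content"). If only the generated SQL text is wanted, a small wrapper along these lines may help. This is a minimal sketch: the helper names build_messages, extract_reply, and generate_sql are hypothetical and not part of the model or of Transformers, and it assumes the pipeline returns chat-style output as it does for a list of messages.

```python
from typing import Any, Dict, List

MODEL_ID = "itsme-nishanth/Sparky-SQL-Llama-3.2-1B"


def build_messages(question: str) -> List[Dict[str, str]]:
    """Wrap a natural-language question as a single-turn chat prompt."""
    return [{"role": "user", "content": question.strip()}]


def extract_reply(outputs: List[Dict[str, Any]]) -> str:
    """Return only the assistant text from a chat-style pipeline result."""
    # generated_text is the full conversation; the last message is the reply.
    last_message = outputs[0]["generated_text"][-1]
    return last_message["content"].strip()


def generate_sql(question: str, device: str = "cuda") -> str:
    """Load the model and answer one question (hypothetical helper)."""
    # Imported lazily so the two helpers above can be used without transformers.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID, device=device)
    return extract_reply(pipe(build_messages(question), max_new_tokens=100))
```

Calling generate_sql("...") downloads the model on first use. For repeated queries it is cheaper to build the pipeline once and reuse it.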

Training Details

Training Data

Training Procedure

Preprocessing

The dataset contained empty records, which were removed before training.
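
The card does not say how the empty records were detected, so the following is only a sketch of one plausible filter. It assumes each record is a dict and treats a record as empty when every field is None or a blank string. The field names "question" and "query" in the usage example are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def is_empty(record: Dict[str, Optional[str]]) -> bool:
    """True when the record has no fields or every field is None/blank."""
    return all(
        value is None or (isinstance(value, str) and not value.strip())
        for value in record.values()
    )


def drop_empty(records: Iterable[Dict[str, Optional[str]]]) -> List[Dict[str, Optional[str]]]:
    """Keep only records that carry at least one non-blank field."""
    return [record for record in records if not is_empty(record)]


# Usage with made-up rows (field names are illustrative only)
rows = [
    {"question": "list all vendors", "query": "SELECT * FROM vendors;"},
    {"question": "", "query": None},
    {},
]
clean_rows = drop_empty(rows)
```

A stricter variant could drop a record when any required field is blank. Which rule matches the original preprocessing is not stated in this card.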

Training Hyperparameters

  • Training regime: mixed precision (bf16 where supported, fp16 otherwise)
  • gradient_accumulation_steps = 4
  • warmup_steps = 5
  • max_steps = 60
  • learning_rate = 2e-4
  • fp16 = not is_bfloat16_supported()
  • bf16 = is_bfloat16_supported()
  • optim = "adamw_8bit"
  • weight_decay = 0.01
  • lr_scheduler_type = "linear"
  • seed = 3407
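
To make the schedule and precision settings above concrete, the sketch below computes the learning rate a linear warmup and decay schedule would produce with these values, and shows how the fp16/bf16 pair is derived from a single capability check. It is an illustration, not the training script: linear_lr and precision_flags are hypothetical helpers, and the bf16_supported argument stands in for the result of is_bfloat16_supported() so the code runs without a GPU.

```python
from typing import Dict


def linear_lr(step: int, base_lr: float = 2e-4,
              warmup_steps: int = 5, max_steps: int = 60) -> float:
    """Learning rate at a given step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr at the end of warmup down to 0 at max_steps.
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


def precision_flags(bf16_supported: bool) -> Dict[str, bool]:
    """Exactly one of fp16/bf16 is enabled, mirroring the settings above."""
    return {"fp16": not bf16_supported, "bf16": bf16_supported}


# Learning rate at a few points of the 60-step run
schedule = {step: linear_lr(step) for step in (0, 5, 30, 60)}
```

With these values the rate peaks at 2e-4 on step 5 and reaches 0 on step 60. On a GPU without bfloat16 support, which should include the Tesla T4 listed under Hardware, precision_flags(False) selects fp16.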

Technical Specifications

Hardware

  • Google Colab (Tesla T4)

Software

  • Transformers
  • Unsloth

Model Card Contact

itsmenishanthkr@gmail.com

  • Model size: 1.24B params
  • Tensor type: FP16 (Safetensors)