MaRyAm1295/Llama-3.2-3B-KAM

Tags: Generated from Trainer, 4-bit precision

KAM-Llama3.2-3B

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct. The training dataset is not specified.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 1
eval_batch_size: 8
seed: 3407
gradient_accumulation_steps: 8
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 20
training_steps: 1200
mixed_precision_training: Native AMP
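
The relationship between train_batch_size, gradient_accumulation_steps, and total_train_batch_size, and the shape of the learning-rate schedule, can be sketched in plain Python. This is a minimal illustration and not the training script: the function names are hypothetical, a single device is assumed (the card does not state the device count), and the schedule is assumed to be the usual linear warmup followed by linear decay to zero that `lr_scheduler_type: linear` normally denotes in Transformers.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 20
TRAINING_STEPS = 1200
NUM_DEVICES = 1  # assumption: the card does not say how many devices were used


def effective_batch_size(per_device, accum_steps, devices=NUM_DEVICES):
    """Number of examples contributing to one optimizer step."""
    return per_device * accum_steps * devices


def linear_schedule_lr(step, base_lr=LEARNING_RATE,
                       warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate after `step` optimizer steps (hypothetical helper).

    Rises linearly from 0 to base_lr over the warmup steps,
    then decays linearly to 0 at the final training step.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 10, 20, 610, 1200):
        print(f"step {s:>4}: lr = {linear_schedule_lr(s):.6f}")
```

Under these assumptions the peak rate of 0.0002 is reached at step 20, and each optimizer step sees 1 × 8 = 8 examples, which matches the reported total_train_batch_size.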

Training results

Step Training Loss

50 2.436700
100 2.103400
150 2.048900
200 2.041700
250 2.002900
300 1.991700
350 1.977400
400 1.974500
450 1.945000
500 1.951100
550 1.950700
600 1.943000
650 1.927900
700 1.920900
750 1.903400
800 1.896000
850 1.910800
900 1.904600
950 1.918100
1000 1.911500
1050 1.909100
1100 1.928900
1150 1.896100
1200 1.876700
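
To make the logged losses easier to read, the sketch below summarizes the table and converts loss to perplexity. It assumes the logged value is the mean per-token cross-entropy in nats, which is the usual case for causal language model training with Trainer but is not stated in the card. The helper names are hypothetical, and only training loss is available since no evaluation loss is reported.

```python
import math

# (step, training loss) pairs copied from the table above.
LOSS_LOG = [
    (50, 2.4367), (100, 2.1034), (150, 2.0489), (200, 2.0417),
    (250, 2.0029), (300, 1.9917), (350, 1.9774), (400, 1.9745),
    (450, 1.9450), (500, 1.9511), (550, 1.9507), (600, 1.9430),
    (650, 1.9279), (700, 1.9209), (750, 1.9034), (800, 1.8960),
    (850, 1.9108), (900, 1.9046), (950, 1.9181), (1000, 1.9115),
    (1050, 1.9091), (1100, 1.9289), (1150, 1.8961), (1200, 1.8767),
]


def perplexity(loss):
    """Perplexity, assuming `loss` is mean cross-entropy in nats."""
    return math.exp(loss)


def summarize(log):
    """Return first, last, and lowest logged loss plus the relative drop."""
    first_step, first_loss = log[0]
    last_step, last_loss = log[-1]
    best_step, best_loss = min(log, key=lambda pair: pair[1])
    return {
        "first": (first_step, first_loss),
        "last": (last_step, last_loss),
        "best": (best_step, best_loss),
        "relative_drop": (first_loss - last_loss) / first_loss,
    }


if __name__ == "__main__":
    summary = summarize(LOSS_LOG)
    print(summary)
    print("perplexity at first log:", round(perplexity(summary["first"][1]), 2))
    print("perplexity at last log: ", round(perplexity(summary["last"][1]), 2))
```

The log shows most of the improvement in the first 100 steps, followed by a slow and slightly noisy decline; the lowest logged value is the final one at step 1200.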

Framework versions

PEFT 0.13.2
Transformers 4.44.2
Pytorch 2.5.0+cu121
Datasets 3.0.2
Tokenizers 0.19.1

Safetensors

Model size: 1.87B params

Tensor type: F32, U8

Model tree for MaRyAm1295/Llama-3.2-3B-KAM

Base model: meta-llama/Llama-3.2-3B-Instruct

Adapter (63): this model
