---
library_name: transformers
license: bigcode-openrail-m
base_model: bigcode/starcoderbase-1b
tags:
- generated_from_trainer
model-index:
- name: peft-starcoder-finetuned
  results: []
---

# peft-starcoder-finetuned

This model is a fine-tuned version of [bigcode/starcoderbase-1b](https://huggingface.co/bigcode/starcoderbase-1b) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.5923
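
As a quick usage sketch, the model can be loaded with `transformers` and `peft`. Two assumptions here: the published artifact is a LoRA/PEFT adapter (as the card name suggests), and it lives at the hypothetical repo id `parth0908/peft-starcoder-finetuned`; adjust both if they differ.

```python
# Minimal sketch: load the assumed PEFT adapter on top of the base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoderbase-1b",
    torch_dtype=torch.float16,
    device_map="auto",
)
# Repo id is an assumption inferred from the card name.
model = PeftModel.from_pretrained(base, "parth0908/peft-starcoder-finetuned")
tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoderbase-1b")

prompt = "def quicksort(arr):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```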

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 50
- training_steps: 2000
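
For reference, these settings map onto `transformers.TrainingArguments` roughly as follows. This is a sketch reconstructed from the list above, not the author's actual training script; values marked as assumed are not stated in the card.

```python
# Hedged reconstruction of the training configuration from the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="peft-starcoder-finetuned",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=8,  # 2 x 8 = total train batch size of 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=2000,
    eval_strategy="steps",  # assumed from the every-100-steps results table
    eval_steps=100,
)
```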

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5622        | 0.1992 | 100  | 1.2485          |
| 0.4589        | 0.3984 | 200  | 1.2126          |
| 0.4216        | 0.5976 | 300  | 1.2730          |
| 0.3743        | 0.7968 | 400  | 1.2278          |
| 0.3535        | 0.9960 | 500  | 1.2615          |
| 0.3011        | 1.1952 | 600  | 1.2960          |
| 0.2653        | 1.3944 | 700  | 1.3112          |
| 0.2734        | 1.5936 | 800  | 1.3759          |
| 0.2855        | 1.7928 | 900  | 1.3015          |
| 0.2528        | 1.9920 | 1000 | 1.3470          |
| 0.2083        | 2.1912 | 1100 | 1.4719          |
| 0.2318        | 2.3904 | 1200 | 1.4494          |
| 0.1935        | 2.5896 | 1300 | 1.4621          |
| 0.1809        | 2.7888 | 1400 | 1.4829          |
| 0.227         | 2.9880 | 1500 | 1.4911          |
| 0.1813        | 3.1873 | 1600 | 1.5903          |
| 0.1893        | 3.3865 | 1700 | 1.5906          |
| 0.1674        | 3.5857 | 1800 | 1.5916          |
| 0.1723        | 3.7849 | 1900 | 1.5921          |
| 0.1843        | 3.9841 | 2000 | 1.5923          |

### Framework versions

- Transformers 4.46.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.3