Training in progress, step 500

Files changed (16)

README.md +37 -17
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
generation_config.json +6 -0
runs/Nov19_13-49-20_localhost/events.out.tfevents.1732004362.localhost +3 -0
runs/Nov19_13-52-01_localhost/events.out.tfevents.1732004522.localhost +3 -0
runs/Nov19_14-03-23_localhost/events.out.tfevents.1732005204.localhost +3 -0
runs/Nov19_14-13-30_localhost/events.out.tfevents.1732005811.localhost +3 -0
runs/Nov21_13-53-49_localhost/events.out.tfevents.1732177431.localhost +3 -0
runs/Nov21_16-15-04_localhost/events.out.tfevents.1732185906.localhost +3 -0
runs/Nov21_17-19-19_localhost/events.out.tfevents.1732189761.localhost +3 -0
runs/Nov21_17-23-58_localhost/events.out.tfevents.1732190040.localhost +3 -0
runs/Nov21_18-47-25_localhost/events.out.tfevents.1732195047.localhost +3 -0
runs/Nov21_19-13-55_localhost/events.out.tfevents.1732196637.localhost +3 -0
runs/Nov21_20-47-21_localhost/events.out.tfevents.1732202243.localhost +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -1,7 +1,7 @@
 ---
-base_model: bigcode/starcoderbase-1b
-library_name: peft
 license: bigcode-openrail-m
 tags:
 - generated_from_trainer
 model-index:
@@ -14,14 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 # peft-starcoder-finetuned
-This model is a fine-tuned version of [bigcode/starcoderbase-1b](https://huggingface.co/bigcode/starcoderbase-1b) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- eval_loss: nan
-- eval_runtime: 51.0638
-- eval_samples_per_second: 1.43
-- eval_steps_per_second: 1.43
-- epoch: 0.3
-- step: 60
 ## Model description
@@ -40,21 +35,46 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0005
-- train_batch_size: 1
-- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 8
-- total_train_batch_size: 8
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 10
-- training_steps: 200
 ### Framework versions
-- PEFT 0.13.2
 - Transformers 4.46.2
 - Pytorch 2.4.1+cu121
 - Datasets 3.0.1
-- Tokenizers 0.20.3

 ---
+library_name: transformers
 license: bigcode-openrail-m
+base_model: bigcode/starcoderbase-1b
 tags:
 - generated_from_trainer
 model-index:
 # peft-starcoder-finetuned
+This model is a fine-tuned version of [bigcode/starcoderbase-1b](https://huggingface.co/bigcode/starcoderbase-1b) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.5923
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 8
+- total_train_batch_size: 16
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 50
+- training_steps: 2000
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.5622        | 0.1992 | 100  | 1.2485          |
+| 0.4589        | 0.3984 | 200  | 1.2126          |
+| 0.4216        | 0.5976 | 300  | 1.2730          |
+| 0.3743        | 0.7968 | 400  | 1.2278          |
+| 0.3535        | 0.9960 | 500  | 1.2615          |
+| 0.3011        | 1.1952 | 600  | 1.2960          |
+| 0.2653        | 1.3944 | 700  | 1.3112          |
+| 0.2734        | 1.5936 | 800  | 1.3759          |
+| 0.2855        | 1.7928 | 900  | 1.3015          |
+| 0.2528        | 1.9920 | 1000 | 1.3470          |
+| 0.2083        | 2.1912 | 1100 | 1.4719          |
+| 0.2318        | 2.3904 | 1200 | 1.4494          |
+| 0.1935        | 2.5896 | 1300 | 1.4621          |
+| 0.1809        | 2.7888 | 1400 | 1.4829          |
+| 0.227         | 2.9880 | 1500 | 1.4911          |
+| 0.1813        | 3.1873 | 1600 | 1.5903          |
+| 0.1893        | 3.3865 | 1700 | 1.5906          |
+| 0.1674        | 3.5857 | 1800 | 1.5916          |
+| 0.1723        | 3.7849 | 1900 | 1.5921          |
+| 0.1843        | 3.9841 | 2000 | 1.5923          |
 ### Framework versions
 - Transformers 4.46.2
 - Pytorch 2.4.1+cu121
 - Datasets 3.0.1
+- Tokenizers 0.20.3
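
The updated hyperparameters and results table can be sanity-checked with a little arithmetic. The sketch below copies its values from the new side of the README diff, recomputes the effective batch size, and finds the evaluation step with the lowest validation loss. Variable names are illustrative and not taken from the training script, and the single-device assumption is inferred from the reported total of 16.

```python
# Values copied from the updated README; names are illustrative only.
train_batch_size = 2
gradient_accumulation_steps = 8

# Assumption: one device, so the effective batch is batch size x accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# (step, training_loss, validation_loss) rows from the "Training results" table.
results = [
    (100, 0.5622, 1.2485), (200, 0.4589, 1.2126), (300, 0.4216, 1.2730),
    (400, 0.3743, 1.2278), (500, 0.3535, 1.2615), (600, 0.3011, 1.2960),
    (700, 0.2653, 1.3112), (800, 0.2734, 1.3759), (900, 0.2855, 1.3015),
    (1000, 0.2528, 1.3470), (1100, 0.2083, 1.4719), (1200, 0.2318, 1.4494),
    (1300, 0.1935, 1.4621), (1400, 0.1809, 1.4829), (1500, 0.2270, 1.4911),
    (1600, 0.1813, 1.5903), (1700, 0.1893, 1.5906), (1800, 0.1674, 1.5916),
    (1900, 0.1723, 1.5921), (2000, 0.1843, 1.5923),
]

# Row with the smallest validation loss, and the last logged row.
best_step, _, best_val = min(results, key=lambda row: row[2])
final_step, _, final_val = results[-1]

print("effective batch size:", total_train_batch_size)
print("lowest validation loss:", best_val, "at step", best_step)
print("final validation loss:", final_val, "at step", final_step)
```

On these numbers the validation loss is lowest early in the run and higher at the final step, while the training loss keeps falling, a pattern commonly associated with overfitting.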

adapter_config.json CHANGED

@@ -21,9 +21,9 @@
   "revision": null,
   "target_modules": [
     "c_proj",
-    "c_attn",
     "q_attn",
-    "c_fc"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "revision": null,
   "target_modules": [
     "c_proj",
     "q_attn",
+    "c_fc",
+    "c_attn"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
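
Both sides of this hunk appear to list the same four module names in a different order. A minimal sketch for confirming that, with the lists copied from the diff above (the variable names are illustrative):

```python
# target_modules as listed on the old and new sides of the hunk.
old_target_modules = ["c_proj", "c_attn", "q_attn", "c_fc"]
new_target_modules = ["c_proj", "q_attn", "c_fc", "c_attn"]

# Compare as sets so that ordering is ignored.
same_modules = set(old_target_modules) == set(new_target_modules)
print("same modules:", same_modules)
```

If the comparison holds, the hunk is a reordering only. That would fit PEFT treating `target_modules` as an unordered collection, so the serialized order can differ between saves; this explanation is an assumption, not something the diff itself states.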

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1f148bdf323d834952224e2ae4b017348008fc4c11c8fa07d046a363b9c3ad9f
 size 22241240

 version https://git-lfs.github.com/spec/v1
+oid sha256:4aa5c58bfd5a52bab356a7c562fddb9d55aff986990047dfb66806b039f01ec3
 size 22241240
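
The file shown here is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the content hash, and the byte size. A small sketch of reading such a pointer and comparing the two sides of the diff follows; `parse_lfs_pointer` is a hypothetical helper written for this example, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


old = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:1f148bdf323d834952224e2ae4b017348008fc4c11c8fa07d046a363b9c3ad9f\n"
    "size 22241240\n"
)
new = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:4aa5c58bfd5a52bab356a7c562fddb9d55aff986990047dfb66806b039f01ec3\n"
    "size 22241240\n"
)

print("size unchanged:", old["size"] == new["size"])
print("content changed:", old["oid"] != new["oid"])
```

An unchanged size with a different hash is what one would expect when the adapter keeps the same tensor shapes but its values are retrained.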

generation_config.json ADDED

@@ -0,0 +1,6 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 0,
+  "eos_token_id": 0,
+  "transformers_version": "4.46.2"
+}

runs/Nov19_13-49-20_localhost/events.out.tfevents.1732004362.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0ddcc82a3767ffcfcebea0b73be76f7569da96b0fd850216e28054df9137d039
+size 5400
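
Each event file name carries a numeric field that, by TensorBoard's usual naming convention, is the creation time in Unix seconds. The sketch below decodes it. `event_file_time` is a hypothetical helper, and the assumed `events.out.tfevents.<seconds>.<hostname>` layout is inferred from the file names in this commit.

```python
from datetime import datetime, timezone


def event_file_time(filename: str) -> datetime:
    """Return the UTC time encoded in a TensorBoard event file name."""
    # Assumed layout: events.out.tfevents.<unix_seconds>.<hostname>
    parts = filename.split(".")
    seconds = int(parts[3])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = event_file_time("events.out.tfevents.1732004362.localhost")
print(created.isoformat())
```

The run directory name (`Nov19_13-49-20`) is presumably written in the machine's local time, so it can differ from the decoded UTC value by that machine's UTC offset.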

runs/Nov19_13-52-01_localhost/events.out.tfevents.1732004522.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b08464490734767d7ad0a624e90d1c96850466c0d5c6ab8ee2b9dd4cc9d7bc07
+size 5400

runs/Nov19_14-03-23_localhost/events.out.tfevents.1732005204.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d37887c67dcf2e336ffd7287f4845f8c328816f1c1040a81f4ba5b523857e029
+size 5400

runs/Nov19_14-13-30_localhost/events.out.tfevents.1732005811.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7c3dcdc6287506eaf736213323da1c5f4303db5e2421f4795db654548bb263b2
+size 17714

runs/Nov21_13-53-49_localhost/events.out.tfevents.1732177431.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e421491fe4a657ad225f5c6f3bfb03ac2e451a2813860ed195e089853b66cf6
+size 5405

runs/Nov21_16-15-04_localhost/events.out.tfevents.1732185906.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:92814aaf00a38c012bc7a40a7d5350a0d83f628119b57676b7b38435481d7907
+size 8758

runs/Nov21_17-19-19_localhost/events.out.tfevents.1732189761.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ac07efe0c79f07c01b947d3b1523b165d857c4427f92e2c06208703130ac86ab
+size 7312

runs/Nov21_17-23-58_localhost/events.out.tfevents.1732190040.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:318969b3f06cc6d724d1085c27dd9795fabc82e0599dc4ae46448f01b3e6ccd8
+size 7312

runs/Nov21_18-47-25_localhost/events.out.tfevents.1732195047.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:39d8915a487f2dd4d74faa4a599e1d152f70597deec10156c225d845dc9f5075
+size 6348

runs/Nov21_19-13-55_localhost/events.out.tfevents.1732196637.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d9f212c6546bc67bf958406bbf8616d79425eb62ad9ceb4792727fcf473fe006
+size 10558

runs/Nov21_20-47-21_localhost/events.out.tfevents.1732202243.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25f4f53257444d95d6014be2421b8999b4153bdae95d35c10c00c9e68bf2a68d
+size 10205

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:aa42ca44218fc8c801de5af36377d87895d26ab0eb0d7aab7ba93fb9d218be55
 size 5304

 version https://git-lfs.github.com/spec/v1
+oid sha256:35a43dbd2422827738d27e88f16c12c5c64b18e213fdd77b162b4b2fbdd0a49f
 size 5304