fatihbicer/distilgpt2-finetuned



fatihbicer committed on Jul 6

Commit e537d3a • 1 Parent(s): eb4f5c9

End of training

Files changed (1)

README.md +8 -6


@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1901
+- Loss: 0.1530
 ## Model description
@@ -42,15 +42,17 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 2
+- num_epochs: 4
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| No log        | 0.96  | 18   | 0.4368          |
-| No log        | 1.92  | 36   | 0.1901          |
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| No log        | 0.96   | 18   | 0.3476          |
+| No log        | 1.9733 | 37   | 0.1676          |
+| No log        | 2.9867 | 56   | 0.1567          |
+| No log        | 3.84   | 72   | 0.1530          |
 ### Framework versions
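
The numbers in this diff can be compared with a little arithmetic. The sketch below is not part of the commit. It copies the validation rows from both versions of the README table and derives the implied optimizer steps per epoch, the relative drop in final validation loss, and a perplexity figure. The perplexity line assumes the reported loss is a mean token-level cross-entropy in nats, which the README does not state. The names `old_run`, `new_run`, and `summarize` are illustrative only.

```python
import math

# (epoch, step, validation_loss) rows copied from the README tables in this diff.
old_run = [(0.96, 18, 0.4368), (1.92, 36, 0.1901)]
new_run = [
    (0.96, 18, 0.3476),
    (1.9733, 37, 0.1676),
    (2.9867, 56, 0.1567),
    (3.84, 72, 0.1530),
]


def summarize(run):
    """Summarize the last evaluation row of a run."""
    epoch, step, loss = run[-1]
    return {
        "final_loss": loss,
        # Assumption: loss is mean cross-entropy in nats, so exp(loss) is perplexity.
        "perplexity": math.exp(loss),
        # Optimizer steps per epoch implied by the final (epoch, step) pair.
        "steps_per_epoch": step / epoch,
    }


old, new = summarize(old_run), summarize(new_run)
relative_drop = (old["final_loss"] - new["final_loss"]) / old["final_loss"]

print(f"old run: {old}")
print(f"new run: {new}")
print(f"relative drop in final validation loss: {relative_drop:.1%}")
```

Both runs imply the same steps per epoch, which is consistent with the training set and `total_train_batch_size` being unchanged between the two commits. Only `num_epochs` differs in the hyperparameters shown.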