---
license: gemma
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: google/gemma-2b
datasets:
  - generator
model-index:
  - name: gemma-2b-storytelling
    results: []
---

# gemma-2b-storytelling

This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the generator dataset. It achieves the following results on the evaluation set:

- Loss: nan

## Model description

This model is a PEFT adapter for google/gemma-2b, fine-tuned for text generation across a range of storytelling themes. It is intended to produce coherent, contextually relevant narratives from user prompts.

## Intended uses & limitations

This model is intended for use in applications requiring high-quality narrative text generation, such as content creation, interactive storytelling, or game design. Users should be aware of potential limitations in the model's understanding of complex contexts or subtleties in language, which may affect the output quality.
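
Below is a minimal inference sketch using `transformers` and `peft`. The adapter repository id (`Maelstrome/gemma-2b-storytelling`) is inferred from this card's author and model name and is an assumption, as is the sample prompt; access to google/gemma-2b also requires accepting the Gemma license on the Hub.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-2b"
adapter_id = "Maelstrome/gemma-2b-storytelling"  # assumed repo id; adjust if it differs

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the PEFT adapter

prompt = "Write a short story about a lighthouse keeper who finds an old map."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```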

## Training and evaluation data

The model was trained on the PocketDoc/RUCAIBox-Story-Generation-Alpaca dataset, a collection of diverse storytelling prompts and responses intended to support varied narrative generation.
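
The dataset can be inspected with the `datasets` library; a quick sketch, assuming it is hosted on the Hub under the id above with a `train` split:

```python
from datasets import load_dataset

# Dataset id taken from the description above; the split name is an assumption.
dataset = load_dataset("PocketDoc/RUCAIBox-Story-Generation-Alpaca", split="train")
print(dataset[0])  # inspect one prompt/response example
```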

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a matching configuration sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- training_steps: 154
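
For reference, here is a sketch of a `transformers.TrainingArguments` configuration matching the values above. The output directory is a placeholder; the Adam betas and epsilon listed above match the library defaults, so they are not set explicitly.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2b-storytelling",  # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,       # 4 x 8 = effective batch size of 32
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    max_steps=154,
)
```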

### Training results

| Training Loss    | Epoch  | Step | Validation Loss |
|:----------------:|:------:|:----:|:---------------:|
| 1454737970954.24 | 0.9164 | 100  | nan             |

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.1
- Pytorch 2.2.2+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1