Maelstrome
committed on
Commit 577d99b • 1 Parent(s): dfd430c
Update README.md
README.md CHANGED
  results: []
---

# gemma-2b-storytelling

This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on the generator dataset.
Its results on the evaluation set are reported in the training results table below.

## Model description

This model has been fine-tuned specifically for the task of text generation, focusing on various storytelling themes. It utilizes advanced language modeling techniques to produce coherent and contextually relevant narratives based on user prompts.

## Intended uses & limitations

This model is intended for use in applications requiring high-quality narrative text generation, such as content creation, interactive storytelling, or game design. Users should be aware of potential limitations in the model's understanding of complex contexts or subtleties in language, which may affect the output quality.
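
As a rough starting point, the adapter can be loaded on top of `google/gemma-2b` with PEFT along the lines of the sketch below. This snippet is not taken from the card itself: the adapter repo id `Maelstrome/gemma-2b-storytelling`, the bfloat16 and `device_map="auto"` settings (which require `accelerate`), and the sampling parameters are all assumptions to adjust for your setup.

```python
# Minimal inference sketch (assumptions noted above, not settings from this card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2b"
adapter_id = "Maelstrome/gemma-2b-storytelling"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Write a short story about a lighthouse keeper who finds a message in a bottle."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.8, top_p=0.95
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the adapter is LoRA-based, `model.merge_and_unload()` can fold the adapter weights into the base model for standalone export.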

## Training and evaluation data

The model was trained on the `PocketDoc/RUCAIBox-Story-Generation-Alpaca` dataset, which contains diverse storytelling prompts and responses intended to support varied narrative generation.
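
For reference, the dataset named above can be pulled with the `datasets` library as sketched below; the `train` split name and the Alpaca-style instruction/response fields are assumptions about its layout rather than details stated in this card.

```python
# Sketch only: load the storytelling dataset named above and inspect its fields
# before building prompts; the "train" split is an assumption.
from datasets import load_dataset

dataset = load_dataset("PocketDoc/RUCAIBox-Story-Generation-Alpaca", split="train")
print(dataset.column_names)  # check the actual field names (e.g. instruction/output)
print(dataset[0])
```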

## Training procedure

The following hyperparameters were used during training (see the sketch after this list):
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- training_steps: 154
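
The list above maps roughly onto `transformers.TrainingArguments` as in the sketch below. The per-device batch size (32 / 8 = 4) is derived from the totals above; the output directory and the learning rate are placeholders, since the learning rate is not part of this excerpt.

```python
# Rough reconstruction of the listed hyperparameters; placeholders are marked.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2b-storytelling",  # placeholder
    seed=42,
    per_device_train_batch_size=4,       # 4 x 8 accumulation steps = 32 total
    gradient_accumulation_steps=8,
    learning_rate=2e-4,                  # placeholder: not listed in this excerpt
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    max_steps=154,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```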

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:----------------:|:------:|:----:|:---------------:|
| 1454737970954.24 | 0.9164 | 100 | nan |

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.1
- Pytorch 2.2.2+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1