Commit 7b6f41c by phanerozoic: "Update README.md" (parent: 2d5813a)

README.md (changed)
---
license: cc-by-nc-4.0
language:
- en
widget:
- text: |
    Who are you?
  example_title: "Introduction Query"
---

![tinyviking.png](https://huggingface.co/phanerozoic/TinyViking-1.1B-v0.1/resolve/main/tinyviking.png)

# TinyViking-1.1B-v0.1

TinyViking-1.1B-v0.1 is a specialized language model for generating Viking-themed content. Developed by phanerozoic, it is fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0 and optimized for environments with limited computing resources.

### Performance

TinyViking is capable of generating engaging Viking narratives, reflecting an understanding of Viking culture. However, it is not designed for general language tasks and may struggle with complex scientific or technical queries.

### Direct Use

Ideal for thematic language generation, particularly for NPCs in games, where fun and thematic engagement are prioritized over detailed factual accuracy.
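For NPC-style use, prompts should match the chat format of the TinyLlama-1.1B-Chat base model. As a minimal sketch (the Zephyr-style template below is an assumption based on the base model and is not stated on this card; verify against the tokenizer's chat template):

```python
# Hypothetical prompt builder for NPC dialogue. The <|system|>/<|user|>/
# <|assistant|> markers follow the Zephyr-style template used by the
# TinyLlama chat base -- an assumption, not confirmed by this card.
def build_npc_prompt(persona: str, user_message: str) -> str:
    return (
        f"<|system|>\n{persona}</s>\n"
        f"<|user|>\n{user_message}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_npc_prompt(
    "You are a Viking warrior from Grettir's Saga.",
    "Who are you?",
)
print(prompt)
```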

### Training Data

Trained on "The Saga of Grettir the Strong: Grettir's Saga" to ensure authentic thematic content.

### Custom Stopping Strings

Custom stopping strings are employed to enhance output quality:
- "},"
- "User:"
- "You:"
- "\nUser"
- "\nUser:"
- "me:"
- "user"
- "\n"
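As an illustrative sketch (not the actual inference code), the effect of these stopping strings can be reproduced by post-processing a completion, cutting it at the earliest occurrence of any stop string:

```python
# Illustrative sketch: truncate a generated completion at the first
# occurrence of any of the card's custom stopping strings.
STOP_STRINGS = ["},", "User:", "You:", "\nUser", "\nUser:", "me:", "user", "\n"]

def truncate_at_stop(text: str, stops=STOP_STRINGS) -> str:
    """Return text up to the earliest stopping string, if any is present."""
    cut = len(text)
    for s in stops:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

print(truncate_at_stop("I am Grettir the Strong.\nUser: who?"))
# -> "I am Grettir the Strong."
```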
34 |
+
|
35 |
+
### Training Hyperparameters and Fine-Tuning Details
|
36 |
+
- **Learning Rate**: 2e-5
|
37 |
+
- **Epochs**: 1
|
38 |
+
- **Training Duration**: Approximately 5.6 minutes on an RTX 6000 Ada GPU
|
39 |
+
- **LoRA Rank**: 2048
|
40 |
+
- **LoRA Alpha**: 4096
|
41 |
+
- **LoRA Dropout**: 0.05
|
42 |
+
- **Cutoff Length**: 256
|
43 |
+
- **Batch Size**: 4 (micro batch size)
|
44 |
+
- **Warmup Steps**: 8
|
45 |
+
- **Optimizer**: adamw_torch
|
46 |
+
- **Gradient Accumulation Steps**: 1
|
47 |
+
|
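A quick back-of-the-envelope check of the settings above (all values taken from this card; no per-layer shapes of TinyLlama are assumed): LoRA scales its weight update by alpha / rank, and the effective batch size is the micro batch size times the gradient accumulation steps.

```python
# Arithmetic sketch using only the hyperparameters listed on this card.
lora_rank = 2048
lora_alpha = 4096
micro_batch_size = 4
grad_accum_steps = 1

# LoRA scales its update by alpha / rank, so these settings give a factor of 2.
lora_scaling = lora_alpha / lora_rank
effective_batch = micro_batch_size * grad_accum_steps

print(lora_scaling)     # 2.0
print(effective_batch)  # 4
```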

### Limitations

Specialized in Viking dialect and narratives, TinyViking is less effective outside its thematic focus.
50 |
+
|
51 |
+
### Compute Infrastructure
|
52 |
+
Trained on an RTX 6000 Ada GPU, demonstrating the feasibility of specialized model training within resource-constrained environments.
|
53 |
+
|
54 |
+
### Results
|
55 |
+
Successfully generates Viking-themed responses, maintaining thematic consistency while displaying improved coherence and depth over previous models due to advancements in dataset generation and parsing.
|
56 |
+
|
57 |
+
### Summary
|
58 |
+
TinyViking-1.1B-v0.1 shows an improvement in quality compared to earlier models, thanks to a new dataset generation method that helps to conserve the base model's already tenuous ability to hold a conversation. While it excels in Viking-themed interactions, its specialized focus limits broader application.
|
59 |
+
|
60 |
+
### Acknowledgments
|
61 |
+
Gratitude to the TinyLlama team, whose foundational work was, as always, essential for developing TinyViking.
|