BramVanroy committed e176936 (1 parent: 856b940)
Update README.md

README.md CHANGED
@@ -49,11 +49,15 @@ More information needed
 
 ## Intended uses & limitations
 
-
+The same limitations as [phi-2](https://huggingface.co/microsoft/phi-2#limitations-of-phi-2), and LLMs in general, apply here. LLMs hallucinate, make mistakes, and should not be trusted. Use at your own risk!
 
 ## Training and evaluation data
 
-
+Fietje 2B instruct was finetuned from [the base model](https://huggingface.co/BramVanroy/fietje-2b) on the following datasets. The number of training samples per dataset is given in parentheses, totalling 201,579.
+
+- [BramVanroy/ultrachat_200k_dutch](https://huggingface.co/datasets/BramVanroy/ultrachat_200k_dutch): gpt-4-1106-preview; multi-turn; fully generated (192,598)
+- [BramVanroy/no_robots_dutch](https://huggingface.co/datasets/BramVanroy/no_robots_dutch): gpt-4-1106-preview; prompts translated, answers generated; some items have system messages (8,181)
+- [BramVanroy/belebele_dutch](https://huggingface.co/datasets/BramVanroy/belebele_dutch): Dutch portion of [belebele](https://huggingface.co/datasets/facebook/belebele), formatted into SFT format (800)
 
 ## Training procedure
 