BramVanroy/fietje-2-instruct

BramVanroy committed on Apr 29

Commit e176936 • 1 Parent(s): 856b940

Update README.md

Files changed (1)

README.md +6 -2

@@ -49,11 +49,15 @@ More information needed
 ## Intended uses & limitations
-More information needed
+The same limitations as [phi-2](https://huggingface.co/microsoft/phi-2#limitations-of-phi-2), and LLMs in general, apply here. LLMs hallucinate, make mistakes, and should not be trusted. Use at your own risk!
 ## Training and evaluation data
-More information needed
+Fietje 2B instruct was finetuned from [the base model](https://huggingface.co/BramVanroy/fietje-2b) on the following datasets. Number of training samples per dataset given in brackets, totalling 201,579.
+- [BramVanroy/ultrachat_200k_dutch](https://huggingface.co/datasets/BramVanroy/ultrachat_200k_dutch): gpt-4-1106-preview; multi-turn; fully generated (192,598)
+- [BramVanroy/no_robots_dutch](https://huggingface.co/datasets/BramVanroy/no_robots_dutch): gpt-4-1106-preview; prompt translate, answer generated; some items have system messages (8181)
+- [BramVanroy/belebele_dutch](https://huggingface.co/datasets/BramVanroy/belebele_dutch): Dutch portion of [belebele](https://huggingface.co/datasets/facebook/belebele), formatted into SFT format (800)
 ## Training procedure
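
Review note: the per-dataset sample counts in the added lines should add up to the stated total of 201,579. A minimal Python sketch of that check follows. The counts are copied by hand from the diff rather than fetched from the Hub, and the name `sample_counts` is purely illustrative.

```python
# Training sample counts copied by hand from the README diff (not fetched from the Hub).
sample_counts = {
    "BramVanroy/ultrachat_200k_dutch": 192_598,
    "BramVanroy/no_robots_dutch": 8_181,
    "BramVanroy/belebele_dutch": 800,
}

# Sum the per-dataset counts to compare against the total stated in the README.
total = sum(sample_counts.values())
print(f"total training samples: {total:,}")

# Show each dataset's share of the fine-tuning mix.
for name, count in sample_counts.items():
    print(f"{name}: {count / total:.1%}")
```

The sum comes to 201,579, which matches the figure in the README text.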