robinsmits committed
Commit 01f632e
1 parent: 7bdfdf1

Update README.md

Files changed (1): README.md (+7 -4)

README.md CHANGED
@@ -16,7 +16,8 @@ tags:
 
 ## Model description
 
-This adapter model is a fine-tuned version of [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b) on the [BramVanroy/alpaca-cleaned-dutch](https://www.huggingface.co/datasets/BramVanroy/alpaca-cleaned-dutch) dataset.
+This adapter model is a fine-tuned version of [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b).
+Finetuning was performed on the Dutch [BramVanroy/alpaca-cleaned-dutch](https://www.huggingface.co/datasets/BramVanroy/alpaca-cleaned-dutch) dataset, which contains 52K records of instruction-following data translated from English to Dutch.
 
 See [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b) for all information about the base model.
 
@@ -51,15 +52,17 @@ For more extensive usage and a lot of generated samples (both good and bad samples)
 ## Intended uses & limitations
 
 The open_llama_7b model was primarily trained on the English language. Part of the dataset was a Wikipedia dump containing pages in 20 languages.
-Dutch was one of those languages. Given the size of the total dataset and the wikipedia part the Dutch language was very likely less than 0.5% of the total data.
+Dutch was one of those languages. Given the size of the total dataset and the Wikipedia part, the Dutch language very likely made up less than 0.5% of the total data.
 
-The primary intention of this model is to explore the use of the Dutch language in combination with an Open LLM.
+The generated output and performance of this model for the Dutch language is very likely not always comparable to the various Open-Llama models that have been finetuned on English Alpaca datasets.
+
+The primary intention of this model is to explore and research the use of the Dutch language in combination with an open LLM.
 
 
 ## Training and evaluation data
 
 This model was trained on the [BramVanroy/alpaca-cleaned-dutch](https://www.huggingface.co/datasets/BramVanroy/alpaca-cleaned-dutch) dataset.
 
-Commercial use is forbidden. This model is intended for research only.
+Based on the dataset license, only non-commercial use is allowed. Commercial use is strictly forbidden.
 
 ## Training procedure