robinsmits committed
Commit 01f632e • 1 Parent(s): 7bdfdf1
Update README.md
README.md CHANGED
@@ -16,7 +16,8 @@ tags:

## Model description

-This adapter model is a fine-tuned version of [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b)
+This adapter model is a fine-tuned version of [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b).
+Finetuning was performed on the Dutch [BramVanroy/alpaca-cleaned-dutch](https://www.huggingface.co/datasets/BramVanroy/alpaca-cleaned-dutch) dataset, which contains 52K records of instruction-following data translated from English to Dutch.

See [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b) for all information about the base model.

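The hunk above describes a PEFT-style adapter on top of open_llama_7b. As a minimal, hedged sketch (not part of this commit), such an adapter is typically attached to its base model with the `peft` library; the adapter repo id below is a placeholder, since this diff does not show it:

```python
# Illustrative sketch only: load a PEFT adapter on top of open_llama_7b.
# "adapter_id" is a placeholder -- the actual adapter repo id is not shown in this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "openlm-research/open_llama_7b"
adapter_id = "user/dutch-alpaca-adapter"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter

# Generate a short completion for a Dutch prompt.
prompt = "Leg uit wat een taalmodel is."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```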
@@ -51,15 +52,17 @@ For more extensive usage and a lot of generated samples (both good and bad sampl

## Intended uses & limitations

The open_llama_7b model was primarily trained on the English language. Part of the dataset was a Wikipedia dump containing pages in 20 languages.
-Dutch was one of those languages. Given the size of the total dataset and the wikipedia part the Dutch language was very likely less than 0.5% of the total data.
+Dutch was one of those languages. Given the size of the total dataset and the Wikipedia part, Dutch very likely made up less than 0.5% of the total data.

-The
+The generated output and performance of this model for the Dutch language is very likely not always comparable to the various Open-Llama models that have been finetuned on English Alpaca datasets.
+
+The primary intention of this model is to explore and research the use of the Dutch language in combination with an Open LLM model.

## Training and evaluation data

This model was trained on the [BramVanroy/alpaca-cleaned-dutch](https://www.huggingface.co/datasets/BramVanroy/alpaca-cleaned-dutch) dataset.

-Commercial use is
+Based on the dataset license, only non-commercial use is allowed. Commercial use is strictly forbidden.

## Training procedure

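The training data referenced in the hunk above can be inspected with the standard Hugging Face `datasets` API. A minimal sketch, assuming only the public dataset id shown in the diff:

```python
# Minimal sketch: load and inspect the Dutch instruction-tuning dataset
# referenced in the README, using the standard "datasets" API.
from datasets import load_dataset

data = load_dataset("BramVanroy/alpaca-cleaned-dutch", split="train")
print(data)     # number of rows and column names
print(data[0])  # one instruction/response record translated to Dutch
```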