Update README.md
README.md
CHANGED
@@ -91,9 +91,9 @@ It is also unknown what the size and composition of the corpus was used to train
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-We used [UltraChat-ITA](https://huggingface.co/datasets/giux78/100k-sft-ready-ultrafeedback-ita) as training data that is a filtered
-For translating the dataset we combined different tools and API we are also evaluating the best
-We have seen that the translation phase can
+We used [UltraChat-ITA](https://huggingface.co/datasets/giux78/100k-sft-ready-ultrafeedback-ita) as training data that is a filtered version of the [`UltraChat`](https://huggingface.co/datasets/stingning/ultrachat).
+For translating the dataset we combined different tools and API we are also evaluating the best approach for translating many more datasets.
+We have seen that the translation phase is critical and can introduce incorrect syntax and semantics.
 
 #### Summary
 Zefiro-7b-beta-ITA-v0.1 is finetuned version of mistral-7b using the zephyr approach for the italian language.