Update README.md
README.md CHANGED
@@ -20,7 +20,7 @@ It outperforms its multilingual counterparts, albeit being much smaller than oth
 VBART-XLarge is created by adding extra Transformer layers between the layers of VBART-Large. Hence, it was able to transfer the learned weights from the smaller model while doubling its number of layers.
 VBART-XLarge improves the results compared to VBART-Large, albeit by small margins.

-This repository contains fine-tuned TensorFlow and Safetensors weights of VBART for the text paraphrasing task.
+This repository contains fine-tuned TensorFlow and Safetensors weights of VBART for the sentence-level text paraphrasing task.

 - **Developed by:** [VNGRS-AI](https://vngrs.com/ai/)
 - **Model type:** Transformer encoder-decoder based on mBART architecture
@@ -51,7 +51,7 @@ The base model is pre-trained on [vngrs-web-corpus](https://huggingface.co/datas
 The fine-tuning dataset is a mixture of the [OpenSubtitles](https://huggingface.co/datasets/open_subtitles), [TED Talks (2013)](https://wit3.fbk.eu/home) and [Tatoeba](https://tatoeba.org/en/) datasets.

 ### Limitations
-This model is fine-tuned for the paraphrasing task. It is not intended to be used in any other case and cannot be fine-tuned to any other task with the full performance of the base model. It is also not guaranteed that this model will work without the specified prompts.
+This model is fine-tuned for the paraphrasing task, at the sentence level only. It is not intended to be used in any other case and cannot be fine-tuned to any other task with the full performance of the base model. It is also not guaranteed that this model will work without the specified prompts.

 ### Training Procedure
 Pre-trained for 30 days on a total of 708B tokens. Fine-tuned for 25 epochs.
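The changed line above points out that this repository ships sentence-level paraphrasing weights. As a minimal usage sketch only: it assumes the checkpoint loads through the standard Hugging Face `transformers` seq2seq classes, and the repository ID and generation settings below are placeholders, not taken from the model card.

```python
# Minimal sketch, not taken from the model card. Assumptions: the weights load
# via AutoTokenizer/AutoModelForSeq2SeqLM, and the repo ID below is a
# placeholder for this repository's actual Hugging Face ID.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "vngrs-ai/VBART-XLarge-Paraphrasing"  # placeholder ID, substitute this repo's ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The model is fine-tuned at the sentence level only, so paraphrase one sentence at a time.
sentence = "Akşam yemeğinden sonra sahilde uzun bir yürüyüş yaptık."  # example Turkish input (assumption)
inputs = tokenizer(sentence, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```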
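Unchanged context in the first hunk describes how VBART-XLarge was built: extra Transformer layers were inserted between the layers of VBART-Large, so the learned weights could be reused while the layer count doubled. Below is a toy sketch of that interleaving idea, using hypothetical names and making no claim to match the actual VBART initialization code.

```python
# Toy illustration only of depth-doubling by interleaving; `pretrained_layers`
# and `new_layer_factory` are hypothetical stand-ins, not VBART training code.
def interleave_layers(pretrained_layers, new_layer_factory):
    """Return a stack twice as deep: pretrained, new, pretrained, new, ..."""
    doubled = []
    for layer in pretrained_layers:
        doubled.append(layer)                # reuse the learned weights
        doubled.append(new_layer_factory())  # freshly initialized layer in between
    return doubled
```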