Update README.md
README.md
CHANGED
@@ -28,8 +28,7 @@ pipeline_tag: translation
 - [Training](#training)
 - [Evaluation](#evaluation)
 - [Citation](#citation)
-- [
-- [Contact](#contact)
+- [Additional information](#additional-information)

 </details>

@@ -37,7 +36,7 @@ pipeline_tag: translation

 Plume is the first LLM trained from scratch for Neural Machine Translation using only parallel Catalan-centric data. It shares its architecture with Gemma 2B and is trained for general, sentence-level translation tasks. For more information about the training, architecture and interpretability of the model, see the paper "Investigating the translation capabilities of Large Language Models trained on parallel data only". The preprint is available on [arXiv]().

-- **Developed by:**
+- **Developed by:** The Language Technologies Unit from Barcelona Supercomputing Center (BSC).
 - **Languages:** Spanish, French, Italian, Portuguese, Galician, German, English, and Basque.
 - **License:** Apache License, Version 2.0

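For context, the usage snippet touched by the next hunk decodes only the newly generated tokens (`output_ids[0, input_length:]`). A minimal inference sketch along those lines, assuming a Transformers-compatible checkpoint; the model id `projecte-aina/Plume` and the `<eng>` target-language tag are illustrative placeholders, not confirmed by this diff, so check the model card for the exact id and prompt format:

```python
# Minimal sketch, not the README's verbatim snippet.
# "projecte-aina/Plume" and the "<eng>" target-language tag are
# hypothetical placeholders; take the real model id and prompt
# format from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "projecte-aina/Plume"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Source sentence followed by a target-language tag (assumed format).
prompt = "El dia fa sol i la temperatura és agradable. <eng>"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
input_length = input_ids.shape[1]

output_ids = model.generate(input_ids, max_new_tokens=128, num_beams=5)
# Decode only the newly generated tokens, as in the README snippet.
generated_text = tokenizer.decode(output_ids[0, input_length:], skip_special_tokens=True)
print(generated_text)
```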
@@ -79,7 +78,7 @@ generated_text = tokenizer.decode(output_ids[0, input_length: ], skip_special_tokens=True)

 ## Training

-Training details are specified in the [paper](). Code for training the model and running other experiments can be found in
+Training details are specified in the [paper](). Code for training the model and running other experiments can be found in our [GitHub repository](https://github.com/projecte-aina/Plume).

 ## Evaluation

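The Evaluation section reports Flores-200 and NTREX scores for the supervised MT directions. As a hedged sketch of how corpus-level BLEU is commonly computed for such tables (not necessarily the authors' exact pipeline), using sacreBLEU:

```python
# Sketch of corpus-level BLEU scoring with sacreBLEU
# (pip install sacrebleu); this mirrors standard MT evaluation
# practice and is not claimed to be the authors' exact setup.
import sacrebleu

# One hypothesis per test sentence, and one reference stream.
hypotheses = ["The day is sunny and the temperature is pleasant."]
references = [["It is a sunny day and the temperature is pleasant."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```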
@@ -101,15 +100,36 @@ Below are the evaluation results on Flores-200 and NTREX for supervised MT directions

 ```

-##
-
-
-
-
-
+## Additional information
+
+### Author
+The Language Technologies Unit from Barcelona Supercomputing Center.
+
+### Contact
+Feel free to write us with any questions you may have at {javier.garcia1, carlos.escolano, aleix.santsavall, francesca.delucafornaciari, audrey.mash, xixian.liao, maite.melero}@bsc.es
+
+### Copyright
+Copyright (c) 2023 by the Language Technologies Unit, Barcelona Supercomputing Center.
+
+### License
+[Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+### Funding
+This work was funded by the [Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en)) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
+
+### Disclaimer
+
+<details>
+<summary>Click to expand</summary>
+
+The model published in this repository is intended for a generalist purpose and is available to third parties under a permissive Apache License, Version 2.0.
+
+Be aware that the model may have biases and/or other undesirable distortions.
+
+When third parties deploy or provide systems and/or services to other parties using this model (or any system based on it), or become users of the model, they should note that it is their responsibility to mitigate the risks arising from its use and, in any event, to comply with applicable regulations, including regulations regarding the use of Artificial Intelligence.
+
+In no event shall the owner and creator of the model (Barcelona Supercomputing Center) be liable for any results arising from the use made by third parties.
+
+</details>