---
base_model: google/mt5-small
license: apache-2.0
datasets:
- opus_books
- iwslt2017
language:
- en
- nl
metrics:
- bleu
- chrf
- chrf++
pipeline_tag: text2text-generation
tags:
- translation
widget:
- text: ">>nl<< Hello, what are you doing?"
---

# Model Card for mt5-small en-nl translation

The mt5-small en-nl translation model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small).

It was fine-tuned on 237k rows of the [iwslt2017](https://huggingface.co/datasets/iwslt2017/viewer/iwslt2017-en-nl) dataset and roughly 38k rows of the [opus_books](https://huggingface.co/datasets/opus_books/viewer/en-nl) dataset. The model was trained for 15 epochs with a batch size of 16.

## How to use

**Install dependencies**

```bash
pip install transformers
pip install sentencepiece
pip install protobuf
# PyTorch backend, needed because the example below uses return_tensors="pt"
pip install torch
```

Use the following code for inference. The model was fine-tuned with a target-language identifier (`>>nl<<`) at the start of each input, so include it in every prompt for the best results.

```Python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Michielo/mt5-small_en-nl_translation")
model = AutoModelForSeq2SeqLM.from_pretrained("Michielo/mt5-small_en-nl_translation")

# tokenize the input, prefixed with the >>nl<< identifier
inputs = tokenizer(">>nl<< Your English text here", return_tensors="pt")

# generate the translation
outputs = model.generate(**inputs)

# decode and print
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```
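
Because the identifier has to be present in every prompt, it can help to add it in one place. The following is a minimal sketch, not part of the released code: the helper names `with_prefix` and `translate`, and the `max_new_tokens=128` cap, are assumptions chosen for illustration.

```python
from typing import Iterable, List

TARGET_PREFIX = ">>nl<<"  # identifier used in this card's examples


def with_prefix(texts: Iterable[str], prefix: str = TARGET_PREFIX) -> List[str]:
    """Prepend the target-language identifier unless it is already there."""
    prompts = []
    for text in texts:
        text = text.strip()
        if text.startswith(prefix):
            prompts.append(text)
        else:
            prompts.append(f"{prefix} {text}")
    return prompts


def translate(
    texts: Iterable[str],
    model_id: str = "Michielo/mt5-small_en-nl_translation",
) -> List[str]:
    """Translate a batch of English sentences (hypothetical convenience wrapper)."""
    # Imported lazily so with_prefix can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    # padding=True pads all prompts in the batch to the same length.
    inputs = tokenizer(with_prefix(texts), return_tensors="pt", padding=True)
    # 128 new tokens is an assumed upper bound, not a value from this card.
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


if __name__ == "__main__":
    # Only demonstrates the prefixing step; no model download is needed.
    print(with_prefix(["Hello, what are you doing?", ">>nl<< Good morning."]))
```

Calling `translate(["Hello, what are you doing?"])` downloads the model on first use and returns a list with one Dutch string per input.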

## Benchmarks

You can replicate our benchmark scores [here](https://github.com/AssistantsLab/AssistantsLab-Replication/tree/main/evaluation) without writing any code yourself.

| Benchmark | Score  |
|-----------|:------:|
| BLEU      | 43.63% |
| chr-F     | 62.25% |
| chr-F++   | 61.87% |
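
For readers unfamiliar with chrF, the sketch below illustrates the general idea: an F-score over character n-grams that weights recall more heavily than precision. It is a simplified, sentence-level illustration written for this card and is not the evaluation code behind the table; the function names and the defaults `max_n=6` and `beta=2.0` are assumptions, and it will not reproduce the reported scores exactly.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF on a 0-100 scale (illustrative, not a reference implementation)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to contain n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 gives recall more weight than precision.
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    print(simple_chrf("de kat zit op de mat", "de kat ligt op de mat"))
```

chrF++ extends the same idea by also counting word n-grams alongside the character n-grams.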

## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) file for details.