---
base_model: google/mt5-small
license: apache-2.0
datasets:
- opus_books
- iwslt2017
language:
- en
- nl
metrics:
- bleu
- chrf
- chrf++
pipeline_tag: text2text-generation
tags:
- translation
widget:
- text: ">>nl<< Hello, what are you doing?"
---

# Model Card for mt5-small en-nl translation

The mt5-small en-nl translation model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small).

It was fine-tuned on 237k rows of the [iwslt2017](https://huggingface.co/datasets/iwslt2017/viewer/iwslt2017-en-nl) dataset and roughly 38k rows of the [opus_books](https://huggingface.co/datasets/opus_books/viewer/en-nl) dataset. The model was trained for 15 epochs with a batch size of 16.
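The card states only the epoch count and batch size. For readers who want to reproduce the setup, below is a minimal, hypothetical fine-tuning sketch using the standard Hugging Face `Seq2SeqTrainer`; everything beyond the two stated hyperparameters (preprocessing, sequence length, and the exact dataset loading calls) is an assumption, not the author's exact pipeline.

```python
from datasets import concatenate_datasets, load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

# Load the two training corpora (loading details may vary with your
# `datasets` version).
iwslt = load_dataset("iwslt2017", "iwslt2017-en-nl", split="train")
books = load_dataset("opus_books", "en-nl", split="train")

def preprocess(batch):
    # Prefix the >>nl<< identifier described in this card; the max_length
    # and truncation settings are assumptions.
    sources = [">>nl<< " + pair["en"] for pair in batch["translation"]]
    targets = [pair["nl"] for pair in batch["translation"]]
    return tokenizer(sources, text_target=targets, max_length=128, truncation=True)

# Tokenize each corpus separately (their raw schemas differ), then combine.
train_dataset = concatenate_datasets([
    iwslt.map(preprocess, batched=True, remove_columns=iwslt.column_names),
    books.map(preprocess, batched=True, remove_columns=books.column_names),
])

args = Seq2SeqTrainingArguments(
    output_dir="mt5-small_en-nl_translation",
    num_train_epochs=15,             # from this card
    per_device_train_batch_size=16,  # from this card
)

Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
).train()
```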


## How to use

**Install dependencies**
```bash
pip install transformers sentencepiece protobuf
```

You can use the following code for model inference. The model was fine-tuned with a target-language identifier (`>>nl<<`) that must be prefixed to the input text for the best results.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Michielo/mt5-small_en-nl_translation")
model = AutoModelForSeq2SeqLM.from_pretrained("Michielo/mt5-small_en-nl_translation")

# tokenize the input, prefixed with the >>nl<< identifier
inputs = tokenizer(">>nl<< Your English text here", return_tensors="pt")
# generate the translation
outputs = model.generate(**inputs)
# decode and print
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```
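The default generation settings are fairly conservative. If you want more control over decoding, you can pass an explicit `GenerationConfig`; the values below (beam search, length limit) are illustrative assumptions rather than settings recommended by this card. The snippet reuses `model`, `tokenizer`, and `inputs` from the block above.

```python
from transformers import GenerationConfig

# Illustrative decoding settings; these are assumptions, not tuned values.
generation_config = GenerationConfig(
    max_new_tokens=128,
    num_beams=4,
    early_stopping=True,
)

outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```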


## Benchmarks
You can replicate our benchmark scores [here](https://github.com/AssistantsLab/AssistantsLab-Replication/tree/main/evaluation) without writing any code yourself.
| Benchmark | Score  |
|-----------|:------:|
| BLEU      | 43.63% |
| chrF      | 62.25% |
| chrF++    | 61.87% |
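
If you prefer to compute the metrics yourself instead of using the replication repository, the `evaluate` library exposes the same metrics. The snippet below is a sketch with toy data, not the official evaluation pipeline; the scores above were computed on the held-out evaluation splits.

```python
import evaluate

# Toy prediction/reference pair purely for illustration.
predictions = ["Hallo, wat ben je aan het doen?"]
references = [["Hallo, wat ben jij aan het doen?"]]

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")

print(bleu.compute(predictions=predictions, references=references)["score"])
print(chrf.compute(predictions=predictions, references=references)["score"])
# chrF++ is chrF with word n-grams of order 2 enabled:
print(chrf.compute(predictions=predictions, references=references, word_order=2)["score"])
```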


## License
This project is licensed under the Apache License 2.0 - see the [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) file for details.