Update README.md
README.md CHANGED
language:
- en
tags:
- detoxification
---
**Model Overview**

This is the model presented in the paper ["MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages"](https://arxiv.org/pdf/2404.02037).
It is [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) fine-tuned on parallel detoxification data for Russian, English, Ukrainian, and Spanish.
**How to use**

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = 'textdetox/mBART_paradetox_rus_ukr_esp_en'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
```
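The snippet above only loads the weights. Below is a minimal inference sketch (not part of the original card) that detoxifies a single sentence, assuming the tokenizer uses the standard mBART-50 language codes (`en_XX`, `ru_RU`, `uk_UA`, `es_XX`); the input sentence and generation settings are illustrative, so check the paper for the exact decoding setup.

```python
# Continuing from the snippet above; a minimal inference sketch, not from the original card.
# Assumption: standard mBART-50 language codes apply (en_XX, ru_RU, uk_UA, es_XX).
tokenizer.src_lang = "en_XX"  # language of the toxic input

toxic_text = "example toxic sentence"  # placeholder input
inputs = tokenizer(toxic_text, return_tensors="pt")

# Keep the output in the same language as the input; beam size and length are illustrative.
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("en_XX"),
    num_beams=5,
    max_new_tokens=64,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

For Russian, Ukrainian, or Spanish inputs, swap `en_XX` for `ru_RU`, `uk_UA`, or `es_XX` respectively.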
**Citation**

```
@article{dementieva2024multiparadetox,
  title={MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages},
  author={Dementieva, Daryna and Babakov, Nikolay and Panchenko, Alexander},
  journal={arXiv preprint arXiv:2404.02037},
  year={2024}
}
```