---
license: apache-2.0
datasets:
- HiTZ/CONAN-EUS
language:
- en
metrics:
- bleu
library_name: transformers
pipeline_tag: text2text-generation
tags:
- counternarrative
- hate speech
- text generation
---
**Content Warning**: This card may contain examples of offensive language that do not reflect the authors’ views.
# Model Card for mT5-counternarrative-en
This model is a fine-tuned version of the text-to-text [mT5-base](https://huggingface.co/google/mt5-base) model that generates counternarratives (CN) against hate speech (HS).
It was fine-tuned using the [CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) splits of the
original CONAN dataset.
The CONAN (COunter NArratives through Nichesourcing) dataset was published by [Chung et al., 2019](https://aclanthology.org/P19-1271.pdf)
and is publicly available at [https://github.com/marcoguerini/CONAN](https://github.com/marcoguerini/CONAN).
CONAN-EUS was created by professionally translating all 6654 English HS-CN pairs of the original CONAN dataset into
**Basque and Spanish**. For the experiments, we generated train, validation and test splits such that no HS-CN pair occurs in more than one split.

| CONAN-EUS Splits | Total HS-CN Count |
|------------------|-------------------|
| train            | 4833              |
| validation       | 537               |
| test             | 1278              |

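
The splitting constraint described above (no HS-CN pair shared between splits) can be sketched as follows. This is a minimal illustration, not the authors' actual procedure, which is not described in this card: the function name `split_pairs`, the deduplication step, the shuffling and the seed are all assumptions. Only the default split sizes come from the table above.

```python
import random


def split_pairs(pairs, sizes=(4833, 537, 1278), seed=0):
    """Split (HS, CN) tuples into disjoint train/validation/test lists.

    Hypothetical helper: exact duplicates are dropped first so that the
    same pair can never end up in two different splits.
    """
    # dict.fromkeys removes exact duplicates while preserving order.
    unique = list(dict.fromkeys(pairs))
    if sum(sizes) > len(unique):
        raise ValueError("not enough unique pairs for the requested sizes")

    # Shuffle with a fixed seed so the split is reproducible (assumed choice).
    rng = random.Random(seed)
    rng.shuffle(unique)

    n_train, n_val, n_test = sizes
    train = unique[:n_train]
    validation = unique[n_train:n_train + n_val]
    test = unique[n_train + n_val:n_train + n_val + n_test]
    return train, validation, test
```

Because the slices are taken from a deduplicated list, the three splits are disjoint by construction. The check can be repeated on any existing split by intersecting the sets of pairs.
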
- 📖 Paper: [Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation](https://arxiv.org/abs/2403.09159) In LREC-COLING 2024.
- 💻 Github Repo (Data and Code): [https://github.com/ixa-ehu/conan-e/](https://github.com/ixa-ehu/conan-e/)
## HS-CN example
The CONAN dataset consists of HS-CN pairs on the topic of Muslims and Islam.
An example of an HS-CN pair in Basque, Spanish and English is shown below:
| HS | CN |
|-------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------|
| Musulmanok ez dugu gure kultura aberastu dezakeen ezer erabilgarririk. | Zer egin dute guretzat musulmanek? Beno, kafea, erlojuak, kamerak, higienea, unibertsitateak, tresna kirurgikoak, mapak, musika, aljebra. |
| Los musulmanes no tienen nada útil que pueda enriquecer nuestra cultura.| ¿Qué han hecho los musulmanes por nosotros? Bueno, café, relojes, cámaras, higiene, universidades, instrumentos quirúrgicos, mapas, música, álgebra.|
| Muslims do not have anything useful that can enrich our culture.| What have Muslims ever done for us? Well, Coffee, Clocks, Cameras, Hygiene, Universities, Surgical Instruments, Maps, Music, Algebra.|
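
Given an English HS input such as the one in the last row, a counternarrative could be generated with the `transformers` library roughly as sketched below. Several details are assumptions rather than documented facts: the repository id in `MODEL_ID` is a placeholder to be replaced with this model's actual Hub name, the raw HS text is assumed to be the full input (no task prefix), and the decoding parameters are illustrative defaults. The function name `generate_counter_narrative` is hypothetical.

```python
# Placeholder Hub id (assumption): replace with this model's actual repository name.
MODEL_ID = "HiTZ/mt5-counter-narrative-en"


def generate_counter_narrative(hate_speech, model_id=MODEL_ID,
                               max_new_tokens=128, num_beams=4):
    """Return one generated counternarrative for a hate speech string."""
    # Imported here so that defining the function does not require
    # transformers (plus torch and sentencepiece) to be installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the HS text alone is the model input, without a prefix.
    inputs = tokenizer(hate_speech, return_tensors="pt", truncation=True)

    # Beam search with illustrative settings; tune for your use case.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_counter_narrative("Muslims do not have anything useful that can enrich our culture.")` downloads the weights on first use and returns the generated CN as a string. The output depends on the checkpoint and decoding settings.
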
## Citation
If you use the model, please **cite these two papers**:
```bibtex
@inproceedings{bengoetxea-et-al-2024,
title={{B}asque and {S}panish {C}ounter {N}arrative {G}eneration: {D}ata {C}reation and {E}valuation},
author={Jaione Bengoetxea and Yi-Ling Chung and Marco Guerini and Rodrigo Agerri},
year={2024},
booktitle = "Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)",
}
```
```bibtex
@inproceedings{chung-etal-2019-conan,
title = "{CONAN} - {CO}unter {NA}rratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech",
author = "Chung, Yi-Ling and
Kuzmenko, Elizaveta and
Tekiroglu, Serra Sinem and
Guerini, Marco",
booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
year = "2019",
pages = "2819--2829"
}
```
**Contact**: [Rodrigo Agerri](https://ragerri.github.io/)
HiTZ Center - Ixa, University of the Basque Country UPV/EHU