@@ -43,7 +43,7 @@ datasets:
 # Claire-7B-0.1
-**Claire-7B-0.1 is a 7B parameter causal decoder-only model built by [LINAGORA](https://labs.linagora.com/) and [OpenLLM-France](https://github.com/OpenLLM-France)**
 **adapted from [Falcon-7b](https://huggingface.co/tiiuae/falcon-7b) on French conversational data.**
 Quantized versions in GGUF format can be found in [TheBloke/Claire-7B-0.1-GGUF](https://huggingface.co/TheBloke/Claire-7B-0.1-GGUF).
@@ -229,6 +229,15 @@ Please note that the model can generate disfluencies and humorous responses as a
 More evaluation details will be provided in a separate publication.
 ## License
 Given that some of the corpora used for training are only available under CC-BY-NC-SA licenses,
@@ -236,14 +245,31 @@ Claire-7B-0.1 is made available under the [CC-BY-NC-SA 4.0 license](https://crea
 You can find a variant of this model published under the Apache 2.0 license at [OpenLLM-France/Claire-7B-Apache-0.1](https://huggingface.co/OpenLLM-France/Claire-7B-Apache-0.1).
 ## Acknowledgements
 This work was performed using HPC resources from GENCI–IDRIS (Grant 2023-AD011014561).
-Claire-7B-0.1 was created by members of [LINAGORA](https://labs.linagora.com/) (in alphabetical order): Ismaïl Harrando, Julie Hunter, Jean-Pierre Lorré, Jérôme Louradour, Michel-Marie Maudet, Virgile Rennard, Guokan Shang.
 Special thanks to partners from the OpenLLM-France community, especially Christophe Cerisara (LORIA), Pierre-Carl Langlais and Anastasia Stasenko (OpSci), and Pierre Colombo, for valuable advice.
 ## Contact
 contact@openllm-france.fr

 # Claire-7B-0.1
+**Claire-7B-0.1 is a 7B-parameter causal decoder-only model built by [LINAGORA](https://labs.linagora.com/) with the support of [OpenLLM-France](https://github.com/OpenLLM-France)**
 **adapted from [Falcon-7b](https://huggingface.co/tiiuae/falcon-7b) on French conversational data.**
 Quantized versions in GGUF format can be found in [TheBloke/Claire-7B-0.1-GGUF](https://huggingface.co/TheBloke/Claire-7B-0.1-GGUF).
 More evaluation details will be provided in a separate publication.
+## Variants
+Claire-7B-0.1 is fine-tuned only on French dialogue data. The following variants are available for evaluating the impact of the language mixture on dialogue understanding:
+* [Claire-7B-FR-EN-25-75](https://huggingface.co/OpenLLM-France/Claire-7B-FR-EN-25-75-0.1), with a 25/75 French-English data split.
+* [Claire-7B-FR-EN-50-50](https://huggingface.co/OpenLLM-France/Claire-7B-FR-EN-50-50-0.1), with a 50/50 French-English data split.
+* [Claire-7B-FR-EN-75-25](https://huggingface.co/OpenLLM-France/Claire-7B-FR-EN-75-25-0.1), with a 75/25 French-English data split.
+* [Claire-7B-EN](https://huggingface.co/OpenLLM-France/Claire-7B-EN-0.1), with only English data.
 ## License
 Given that some of the corpora used for training are only available under CC-BY-NC-SA licenses,
 You can find a variant of this model published under the Apache 2.0 license at [OpenLLM-France/Claire-7B-Apache-0.1](https://huggingface.co/OpenLLM-France/Claire-7B-Apache-0.1).
+## Citation
+When using the Claire family of models, please cite the following paper:
+Jérôme Louradour, Julie Hunter, Ismaïl Harrando, Guokan Shang, Virgile Rennard & Jean-Pierre Lorré (2024). [Claire: Large Language Models for Spontaneous French Dialogue](https://aclanthology.org/2024.jeptalnrecital-taln.36.pdf). In _Actes de la 31ème Conférence sur le Traitement Automatique des Langues Naturelles, volume 1: articles longs et prises de position_ (pp. 530-548).
+```bibtex
+@inproceedings{louradour2024claire,
+  title={Claire: Large Language Models for Spontaneous French Dialogue},
+  author={Louradour, J{\'e}r{\^o}me and Hunter, Julie and Harrando, Isma{\"\i}l and Shang, Guokan and Rennard, Virgile and Lorr{\'e}, Jean-Pierre},
+  booktitle={Actes de la 31{\`e}me Conf{\'e}rence sur le Traitement Automatique des Langues Naturelles, volume 1: articles longs et prises de position},
+  pages={530--548},
+  year={2024}
+}
+```
 ## Acknowledgements
 This work was performed using HPC resources from GENCI–IDRIS (Grant 2023-AD011014561).
+Claire-7B-0.1 was created by members of [LINAGORA](https://labs.linagora.com/).
 Special thanks to partners from the OpenLLM-France community, especially Christophe Cerisara (LORIA), Pierre-Carl Langlais and Anastasia Stasenko (OpSci), and Pierre Colombo, for valuable advice.
 ## Contact
 contact@openllm-france.fr