Update README.md
README.md
CHANGED
@@ -47,14 +47,16 @@ Si queréis incluir una versión de la Model Card en español, enlazadla aquí a

 -->

-
-
-
+More than 600 million Spanish-speaking people need resources, such as LLMs, to obtain medical information freely and safely,
+complying with the millennium objectives proposed by the UN: Health and Well-being, Quality Education and End of Poverty.
+There are few LLMs for the medical domain in the Spanish language.

-
-
-
-
+The objective of this project is to create a large language model (LLM) for the medical context in Spanish, allowing the creation of solutions
+and health information services in LATAM. The model will have information on conventional, natural and traditional medicines.
+An output of the project is a public medical-domain dataset that pools resources from other sources and allows LLMs to be created or fine-tuned.
+The performance results of the LLM are compared with other state-of-the-art models such as BioMistral, Meditron and MedPalm.
+
+[**Model Card in Spanish**](README_es.md)

 ## Model Details

@@ -93,8 +95,8 @@ Los resultados del desempeño del LLM se comparan con otros modelos del state-o

 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

-
-
+The creators of the LLM are not responsible for any harmful results it may generate. A rigorous evaluation of the
+generated results by specialists is suggested.

 ## Bias, Risks, and Limitations

@@ -162,7 +164,7 @@ Dataset used was [somosnlp/SMC/](https://huggingface.co/datasets/somosnlp/SMC/)

 <!-- This should link to a Dataset Card. -->

-
+The corpus used was 20% of [somosnlp/SMC](https://huggingface.co/datasets/somosnlp/SMC/).

 #### Factors

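The 20% sample mentioned above can be read directly with the `datasets` library. A minimal sketch, assuming the corpus exposes a standard `train` split and that a simple head slice is acceptable; the exact subset selection used for this model is not documented in the card:

```python
from datasets import load_dataset

# Load the first 20% of the (assumed) train split of the SMC corpus.
# The actual subset used for fine-tuning may have been chosen differently.
corpus = load_dataset("somosnlp/SMC", split="train[:20%]")
print(corpus)
```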
@@ -199,8 +201,8 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

 ### Model Architecture and Objective

-
-
+The architecture of [BioMistral/BioMistral-7B](https://huggingface.co/BioMistral/BioMistral-7B) was used because it is a foundational model
+trained with a medical-domain dataset.

 ### Compute Infrastructure

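A minimal sketch of that starting point, assuming a standard `transformers` causal-LM setup (the actual fine-tuning script is not part of this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# BioMistral-7B is the foundational checkpoint the card names as the architecture.
base_id = "BioMistral/BioMistral-7B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
```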
@@ -224,8 +226,6 @@ Nvidia T4 Small 4 vCPU 15 GB RAM 16 GB VRAM
 - accelerate
 - datasets

-[More Information Needed]
-
 ## License

 <!-- State under which license the model is released, explaining, if it is not Apache 2.0, why the more restrictive license applies (i.e. inherited from the licenses of the pre-trained model or of the data used). -->
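To illustrate how the listed hardware and packages fit together: in half precision a 7B model needs roughly 14 GB for its weights, which just fits the 16 GB VRAM of the T4, and `device_map="auto"` relies on the `accelerate` package listed above. This is a hypothetical inference sketch, not the training setup, and the checkpoint id is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BioMistral/BioMistral-7B"  # placeholder; substitute the fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
# fp16 weights (~2 bytes per parameter, ~14 GB for 7B) roughly fit a 16 GB T4.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Pregunta: ¿Qué es la hipertensión arterial?\nRespuesta:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```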
@@ -271,13 +271,10 @@ Aquí tenéis un ejemplo de cita de un dataset que podéis adaptar:

 <!-- State here the framework in which the project was developed; in this section you can include acknowledgements and more information about the team members. You can adapt the example to your liking. -->

-Este proyecto fue desarrollado durante el [Hackathon #Somos600M](https://somosnlp.org/hackathon) organizado por SomosNLP.
-El modelo fue entrenado usando GPU patrocinado por HuggingFace.

-<!--
 This project was developed during the [Hackathon #Somos600M](https://somosnlp.org/hackathon) organized by SomosNLP.
 The model was trained using GPUs sponsored by HuggingFace.
--->
+

 **Team:**

@@ -293,6 +290,4 @@ El modelo fue entrenado usando GPU patrocinado por HuggingFace.
 ## Contact

 <!-- Contact email for possible questions about the model. -->
-
-
-Para cualquier duda contactar a: Dr.C Dionis López (inoid2007@gmail.com)
+For any questions or suggestions, contact: PhD Dionis López (inoid2007@gmail.com)