Update README.md
Commit d196802 by mmoreirast (1 parent: a2dd724)
README.md CHANGED
@@ -29,13 +29,13 @@ Mariana Moreira dos Santos ([LinkedIn](https://www.linkedin.com/in/mmoreirast/))
 You can check the codes used to fine-tune the model at the following [Google Colab](https://colab.research.google.com/drive/1SvJvTcH3IRnsEv72UxkVmV0oClCZARtE?usp=sharing) link.
 
 ## Fine-tuning details
-- **Base model:** [TeenyTinyLlama
+- **Base model:** [TeenyTinyLlama 160m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-160m)
 - **Context length:** 2048 tokens
 - **Dataset for fine-tuning:** [medicine-training-pt](mmoreirast/medicine-training-pt)
 - **Dataset for evaluation:** [medicine-evaluation-pt](https://huggingface.co/datasets/mmoreirast/medicine-evaluation-pt)
 - **Language:** Portuguese
-- **GPU:** NVIDIA
-- **Training time**: ~
+- **GPU:** NVIDIA L4
+- **Training time**: ~9 hours
 
 ## Parameters
 - **Number of Epochs:** 4
@@ -61,7 +61,7 @@ Using the `pipeline`:
 ```python
 from transformers import pipeline
 
-generator = pipeline("text-generation", model="mmoreirast/Doctor-Llama-
+generator = pipeline("text-generation", model="mmoreirast/Doctor-Llama-160m")
 
 completions = generator("Me fale sobre o sistema nervoso", num_return_sequences=2, max_new_tokens=100)
 
@@ -76,8 +76,8 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 
 # Load model and the tokenizer
-tokenizer = AutoTokenizer.from_pretrained("mmoreirast/Doctor-Llama-
-model = AutoModelForCausalLM.from_pretrained("mmoreirast/Doctor-Llama-
+tokenizer = AutoTokenizer.from_pretrained("mmoreirast/Doctor-Llama-160m", revision='main')
+model = AutoModelForCausalLM.from_pretrained("mmoreirast/Doctor-Llama-160m", revision='main')
 
 # Pass the model to your device
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
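The last hunk ends at the device selection, so the rest of the README snippet is not visible in this diff. The following is only a minimal sketch of how the loaded model might then be used to generate text, not the README's actual continuation. The helper name `generate_completions`, the `main` wrapper, and the choice of `do_sample=True` are assumptions made for illustration; the repository id, prompt, `revision='main'`, and generation limits are taken from the changed lines.

```python
def generate_completions(model, tokenizer, prompt, device,
                         num_return_sequences=2, max_new_tokens=100):
    """Hypothetical helper: tokenize a prompt, sample completions, decode them."""
    # Tokenize the prompt and move the resulting tensors to the target device.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    # Sampling is enabled (an assumption) because greedy decoding cannot
    # return more than one distinct sequence per prompt.
    outputs = model.generate(
        **inputs,
        do_sample=True,
        num_return_sequences=num_return_sequences,
        max_new_tokens=max_new_tokens,
    )
    # Decode each generated sequence of token ids back into text.
    return [tokenizer.decode(ids, skip_special_tokens=True) for ids in outputs]


def main():
    # Imported here so the helper above can be used without loading the
    # heavy dependencies; calling main() downloads the model from the Hub.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "mmoreirast/Doctor-Llama-160m"
    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision="main")
    model = AutoModelForCausalLM.from_pretrained(repo_id, revision="main")

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()  # inference mode: disables dropout

    prompt = "Me fale sobre o sistema nervoso"
    for text in generate_completions(model, tokenizer, prompt, device):
        print(text)
```

Calling `main()` requires `torch` and `transformers` to be installed and network access to the Hugging Face Hub. Because sampling is enabled, the printed completions differ from run to run.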