Update README.md
README.md
CHANGED
@@ -34,16 +34,12 @@ See the snippet below for usage with Transformers:
 >>> pipeline("Hey how are you doing today?")
 ```

-## Training
+## Training information

-`AI-Sweden-Models/Llama-3-8B` was trained on a subset from [The nordic pile](https://arxiv.org/abs/2303.17183)
+`AI-Sweden-Models/Llama-3-8B` is a continuation of the pretraining process from `meta-llama/Meta-Llama-3-8B`. It was trained on a subset of [The Nordic Pile](https://arxiv.org/abs/2303.17183) containing Swedish, Norwegian and Danish.

-
-
-**Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute.
+A total of 92 A100 GPUs were used, on roughly 250 GB of data.

 ## Benchmarks

-Coming soon.
-
-<iframe src="https://wandb.ai/nlu-group/llama3-nordic-pile-1/workspace?nw=nwusertimpal0l" style="border:none;height:1024px;width:100%">
+Coming soon.
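For context, the unchanged lines at the top of the hunk are the tail of the Transformers usage snippet the hunk header refers to. Below is a minimal sketch of what that snippet presumably looks like, assuming the standard `transformers` text-generation pipeline; the `torch_dtype` and `device_map` settings are illustrative assumptions, not taken from the README.

```python
# Hypothetical reconstruction of the README usage snippet (assumptions noted above).
import torch
import transformers

model_id = "AI-Sweden-Models/Llama-3-8B"

# Build a text-generation pipeline; dtype and device placement are assumptions.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# The diff's context line calls the pipeline directly on a prompt string.
print(pipeline("Hey how are you doing today?"))
```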