Update README.md
README.md
CHANGED
@@ -34,16 +34,12 @@ See the snippet below for usage with Transformers:
 >>> pipeline("Hey how are you doing today?")
 ```

-## Training
+## Training information

-`AI-Sweden-Models/Llama-3-8B` was trained on a subset from [The nordic pile](https://arxiv.org/abs/2303.17183)
+`AI-Sweden-Models/Llama-3-8B` is a continuation of the pretraining process from `meta-llama/Meta-Llama-3-8B`. It was trained on a subset of [The Nordic Pile](https://arxiv.org/abs/2303.17183) containing Swedish, Norwegian and Danish.

-
-
-**Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute.
+A total of 92 A100 GPUs were used, on roughly 250 GB of data.

 ## Benchmarks

-Coming soon.
-
-<iframe src="https://wandb.ai/nlu-group/llama3-nordic-pile-1/workspace?nw=nwusertimpal0l" style="border:none;height:1024px;width:100%">
+Coming soon.
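For context, the unchanged lines at the top of the hunk are the tail of the Transformers usage snippet the hunk header refers to. Below is a minimal sketch of what that snippet presumably looks like, assuming the standard `transformers` text-generation pipeline; the `torch_dtype` and `device_map` settings are illustrative assumptions, not taken from the README.

```python
# Hypothetical reconstruction of the README usage snippet (assumptions noted above).
import torch
import transformers

model_id = "AI-Sweden-Models/Llama-3-8B"

# Build a text-generation pipeline; dtype and device placement are assumptions.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# The diff's context line calls the pipeline directly on a prompt string.
print(pipeline("Hey how are you doing today?"))
```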