jonabur
committed on
Commit
•
9f3d465
1
Parent(s):
dc0e31c
add 700B checkpoint
Browse files
- README.md +2 -1
- config.json +2 -2
- generation_config.json +1 -1
- model-00001-of-00014.safetensors +1 -1
- model-00002-of-00014.safetensors +1 -1
- model-00003-of-00014.safetensors +1 -1
- model-00004-of-00014.safetensors +1 -1
- model-00005-of-00014.safetensors +1 -1
- model-00006-of-00014.safetensors +1 -1
- model-00007-of-00014.safetensors +1 -1
- model-00008-of-00014.safetensors +1 -1
- model-00009-of-00014.safetensors +1 -1
- model-00010-of-00014.safetensors +1 -1
- model-00011-of-00014.safetensors +1 -1
- model-00012-of-00014.safetensors +1 -1
- model-00013-of-00014.safetensors +1 -1
- model-00014-of-00014.safetensors +1 -1
README.md
CHANGED
@@ -14,7 +14,7 @@ datasets:
|
|
14 |
|
15 |
_**NOTE:** This is a **research checkpoint** of a model for which **training has not been completed.** It is being provided in its current state for research and testing purposes. **Care should be taken when using the outputs of the model.** Once pretraining has completed we intend to release additional instruction-tuned and chat-tuned varieties._
|
16 |
|
17 |
-
Poro is a 34B parameter decoder-only transformer pretrained on Finnish, English and code. It is being trained on 1 trillion tokens (600 billion as of this release). Poro is a fully open source model and is made available under the Apache 2.0 License.
|
18 |
|
19 |
Poro was created in a collaboration between [SiloGen](https://www.silo.ai/silogen) from [Silo AI](https://www.silo.ai/), the [TurkuNLP group](https://turkunlp.org/) of the University of Turku, and [High Performance Language Technologies](https://hplt-project.org/) (HPLT). Training was conducted on the [LUMI supercomputer](https://www.lumi-supercomputer.eu/), using compute resources generously provided by [CSC](https://csc.fi/) - IT Center for Science, Finland.
|
20 |
|
@@ -48,6 +48,7 @@ Checkpoints are available as branches in the repository. Checkpoints will be re
|
|
48 |
* [400B](https://huggingface.co/LumiOpen/Poro-34B/tree/400B)
|
49 |
* [500B](https://huggingface.co/LumiOpen/Poro-34B/tree/500B)
|
50 |
* [600B](https://huggingface.co/LumiOpen/Poro-34B/tree/600B)
|
|
|
51 |
|
52 |
The transformers library allows you to load a checkpoint from a branch as follows:
|
53 |
|
|
|
14 |
|
15 |
_**NOTE:** This is a **research checkpoint** of a model for which **training has not been completed.** It is being provided in its current state for research and testing purposes. **Care should be taken when using the outputs of the model.** Once pretraining has completed we intend to release additional instruction-tuned and chat-tuned varieties._
|
16 |
|
17 |
+
Poro is a 34B parameter decoder-only transformer pretrained on Finnish, English and code. It is being trained on 1 trillion tokens (700 billion as of this release). Poro is a fully open source model and is made available under the Apache 2.0 License.
|
18 |
|
19 |
Poro was created in a collaboration between [SiloGen](https://www.silo.ai/silogen) from [Silo AI](https://www.silo.ai/), the [TurkuNLP group](https://turkunlp.org/) of the University of Turku, and [High Performance Language Technologies](https://hplt-project.org/) (HPLT). Training was conducted on the [LUMI supercomputer](https://www.lumi-supercomputer.eu/), using compute resources generously provided by [CSC](https://csc.fi/) - IT Center for Science, Finland.
|
20 |
|
|
|
48 |
* [400B](https://huggingface.co/LumiOpen/Poro-34B/tree/400B)
|
49 |
* [500B](https://huggingface.co/LumiOpen/Poro-34B/tree/500B)
|
50 |
* [600B](https://huggingface.co/LumiOpen/Poro-34B/tree/600B)
|
51 |
+
* [700B](https://huggingface.co/LumiOpen/Poro-34B/tree/700B)
|
52 |
|
53 |
The transformers library allows you to load a checkpoint from a branch as follows:
|
54 |
|
config.json
CHANGED
@@ -1,5 +1,5 @@
|
|
1 |
{
|
2 |
-
"_name_or_path": "/scratch/project_462000319/general-tools/checkpoints/
|
3 |
"apply_residual_connection_post_layernorm": false,
|
4 |
"architectures": [
|
5 |
"BloomForCausalLM"
|
@@ -20,7 +20,7 @@
|
|
20 |
"pretraining_tp": 2,
|
21 |
"slow_but_exact": false,
|
22 |
"torch_dtype": "bfloat16",
|
23 |
-
"transformers_version": "4.
|
24 |
"use_cache": true,
|
25 |
"vocab_size": 128000
|
26 |
}
|
|
|
1 |
{
|
2 |
+
"_name_or_path": "/scratch/project_462000319/general-tools/checkpoints/33B_torch_step166752_bfloat16",
|
3 |
"apply_residual_connection_post_layernorm": false,
|
4 |
"architectures": [
|
5 |
"BloomForCausalLM"
|
|
|
20 |
"pretraining_tp": 2,
|
21 |
"slow_but_exact": false,
|
22 |
"torch_dtype": "bfloat16",
|
23 |
+
"transformers_version": "4.36.0",
|
24 |
"use_cache": true,
|
25 |
"vocab_size": 128000
|
26 |
}
|
generation_config.json
CHANGED
@@ -3,5 +3,5 @@
|
|
3 |
"bos_token_id": 1,
|
4 |
"eos_token_id": 2,
|
5 |
"pad_token_id": 3,
|
6 |
-
"transformers_version": "4.
|
7 |
}
|
|
|
3 |
"bos_token_id": 1,
|
4 |
"eos_token_id": 2,
|
5 |
"pad_token_id": 3,
|
6 |
+
"transformers_version": "4.36.0"
|
7 |
}
|
model-00001-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4712820784
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:ad4beef808771cae6edf6f14579dfbff9a40c655f61bb5e8d8575b93c714042a
|
3 |
size 4712820784
|
model-00002-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252680
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:5382b95e1b9991b956fced7510623a59667404af1086b2d34f1e8237e61580e0
|
3 |
size 4933252680
|
model-00003-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252648
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:f64b5a0c8cc2414f9f01f765957db059ed1a81ec0d1887a8ab01c36036641c16
|
3 |
size 4933252648
|
model-00004-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:2612ffbe340c98a35f0f7b8e32a90e3fb35c276ae363d97690b29655b0c0f631
|
3 |
size 4933252728
|
model-00005-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:ec36fa7fe18e7e3c115cf1d8b6a52ae93fa9ed8aae675f87b5ed32725b4a340a
|
3 |
size 4933252728
|
model-00006-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:ac4ee77d6ac1264c542039cc432f99786a0fcae796a9f6ec9178a507b77dba86
|
3 |
size 4933252728
|
model-00007-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:129e11b03118922eb1b24085b08e85af0cd2e73ad6b0e6b1ca8824bb583e3de8
|
3 |
size 4933252728
|
model-00008-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:30cddd448da7da54664abc79ab3d6001de64b586e07d6641b2c5923a950be538
|
3 |
size 4933252728
|
model-00009-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:05e7b1db555170ecf0095cc25cdcd3d5d683d42c5dbb444286c226af1e8750ea
|
3 |
size 4933252728
|
model-00010-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:2d901daa5518111d19ad3b34fe83480fce97bc33c23cd1414e89dcee1a6a0883
|
3 |
size 4933252728
|
model-00011-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:3c89c4ba9550cee307b3614361f04dc7c06d10e0b9d9b74e4eade02bfb673768
|
3 |
size 4933252728
|
model-00012-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:fdf43f8744a3f9a1d4d85d2c4912cc7c8eb8857fa1896319df0dfe75ffdf8166
|
3 |
size 4933252728
|
model-00013-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4933252728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:aba623da877b94b34c6194db40b4ef1ba97331cc37e23c72e6a9cdfca4d8b547
|
3 |
size 4933252728
|
model-00014-of-00014.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4522124144
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c3ddfdb7ce089a24e81758eddf768a7057ba3bd6dc74c33182414b3ee640d103
|
3 |
size 4522124144
|