jonabur committed
Commit 8502bcf
1 Parent(s): 7733eca
improve descriptions
README.md
CHANGED
@@ -12,7 +12,7 @@ datasets:
 
 # Poro 34B Model Card
 
-_**NOTE:** This is a **research checkpoint** of a model for which **training has not been completed.** It is being provided in its current state for research and testing purposes. **Care should be taken when using the outputs of the model.** Once pretraining has completed we intend to release
+_**NOTE:** This is a **research checkpoint** of a model for which **training has not been completed.** It is being provided in its current state for research and testing purposes. **Care should be taken when using the outputs of the model.** Once pretraining has completed we intend to release additional instruction-tuned and chat-tuned varieties._
 
 Poro is a 34B parameter decoder-only transformer pretrained on Finnish, English and code. It is being trained on 1 trillion tokens (300 billion as of this release). Poro is a fully open source model and is made available under the Apache 2.0 License.
 
@@ -27,7 +27,7 @@ _What does Poro mean?_ Poro is the Finnish word for Reindeer! 🦌 These animals
 ## Model Overview
 _**NOTE:** In addition to being an early research release, Poro is a base model which needs further fine tuning for most use cases._
 
-Poro is a generative pretrained transformer using a BLOOM architecture, and makes use of ALiBi embeddings to support context length extrapolation.
+Poro is a generative pretrained transformer using a BLOOM architecture, and makes use of ALiBi embeddings to support context length extrapolation at inference time.
 
 | Hyperparameter | Value |
 | :------------- | :----: |
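The changed line in the second hunk mentions ALiBi, the attention-bias scheme that lets the model extrapolate to contexts longer than those seen in training. A minimal sketch of what that bias looks like, assuming the standard geometric per-head slopes for a power-of-two head count; the function names here are illustrative, not Poro's actual implementation:

```python
def alibi_slopes(n_heads):
    # Geometric sequence of per-head slopes (n_heads a power of two):
    # m_i = 2^(-8 * i / n_heads), for i = 1 .. n_heads.
    return [2 ** (-8 * (i + 1) / n_heads) for i in range(n_heads)]

def alibi_bias(seq_len, n_heads):
    # Static bias added to pre-softmax attention scores: each query
    # position penalizes earlier keys in proportion to their distance,
    # which is what allows length extrapolation at inference time.
    slopes = alibi_slopes(n_heads)
    return [
        [[-slope * (q - k) if k <= q else 0.0 for k in range(seq_len)]
         for q in range(seq_len)]
        for slope in slopes
    ]
```

Because the bias is a fixed function of relative distance rather than a learned positional embedding, it is defined for any sequence length, including lengths never seen during pretraining.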