jonabur committed
Commit 8502bcf
1 Parent(s): 7733eca
improve descriptions
README.md
CHANGED
@@ -12,7 +12,7 @@ datasets:
 
 # Poro 34B Model Card
 
-_**NOTE:** This is a **research checkpoint** of a model for which **training has not been completed.** It is being provided in its current state for research and testing purposes. **Care should be taken when using the outputs of the model.** Once pretraining has completed we intend to release
+_**NOTE:** This is a **research checkpoint** of a model for which **training has not been completed.** It is being provided in its current state for research and testing purposes. **Care should be taken when using the outputs of the model.** Once pretraining has completed we intend to release additional instruction-tuned and chat-tuned varieties._
 
 Poro is a 34B parameter decoder-only transformer pretrained on Finnish, English and code. It is being trained on 1 trillion tokens (300 billion as of this release). Poro is a fully open source model and is made available under the Apache 2.0 License.
 
@@ -27,7 +27,7 @@ _What does Poro mean?_ Poro is the Finnish word for Reindeer! 🦌 These animals
 ## Model Overview
 _**NOTE:** In addition to being an early research release, Poro is a base model which needs further fine tuning for most use cases._
 
-Poro is a generative pretrained transformer using a BLOOM architecture, and makes use of ALiBi embeddings to support context length extrapolation.
+Poro is a generative pretrained transformer using a BLOOM architecture, and makes use of ALiBi embeddings to support context length extrapolation at inference time.
 
 | Hyperparameter | Value |
 | :------------- | :----: |
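The changed line in the second hunk mentions ALiBi, the attention-bias scheme that lets the model extrapolate to contexts longer than those seen in training. A minimal sketch of what that bias looks like, assuming the standard geometric per-head slopes for a power-of-two head count; the function names here are illustrative, not Poro's actual implementation:

```python
def alibi_slopes(n_heads):
    # Geometric sequence of per-head slopes (n_heads a power of two):
    # m_i = 2^(-8 * i / n_heads), for i = 1 .. n_heads.
    return [2 ** (-8 * (i + 1) / n_heads) for i in range(n_heads)]

def alibi_bias(seq_len, n_heads):
    # Static bias added to pre-softmax attention scores: each query
    # position penalizes earlier keys in proportion to their distance,
    # which is what allows length extrapolation at inference time.
    slopes = alibi_slopes(n_heads)
    return [
        [[-slope * (q - k) if k <= q else 0.0 for k in range(seq_len)]
         for q in range(seq_len)]
        for slope in slopes
    ]
```

Because the bias is a fixed function of relative distance rather than a learned positional embedding, it is defined for any sequence length, including lengths never seen during pretraining.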