Teja-Gollapudi committed

Commit efe2d41 • Parent(s): bc1fe6c

Update README.md

README.md (CHANGED)
# VMware/open-llama-0.3T-7B-open-instruct-v1.1

## License

- <b>Commercially viable</b>
- The instruction dataset, [VMware/open-instruct-v1.1-oasst-dolly-hhrlhf](https://huggingface.co/datasets/VMware/open-instruct-v1.1-oasst-dolly-hhrlhf), is under the cc-by-sa-3.0 license, and the language model, [openlm-research/open_llama_7b_preview_300bt](https://huggingface.co/openlm-research/open_llama_7b_preview_300bt/tree/main/open_llama_7b_preview_300bt_transformers_weights), is under the apache-2.0 license.

## Nomenclature

- Model: Open-llama
- Model trained on: 300B (0.3T) tokens
- Model size: 7B parameters
- Dataset: Open-instruct-v1.1 (oasst, dolly, hhrlhf)
## Use in Transformers

…
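The model card's own Transformers example is collapsed in this diff. As a supplementary sketch only, instruction-tuned checkpoints like this one are usually prompted through a fixed template; the Alpaca-style template below is an assumption for illustration, not taken from this diff, so check the model card for the exact format:

```python
# Hypothetical sketch: building an instruction prompt for an
# Alpaca-style instruction-tuned model. The exact template used by
# VMware/open-llama-0.3T-7B-open-instruct-v1.1 may differ.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

def build_prompt(instruction: str) -> str:
    """Fill the instruction slot of the template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("What is the capital of France?")
print(prompt)
```

The resulting string would then be passed to the tokenizer and `generate()` as shown in the model card's own example.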
## Drawbacks

- The model was trained on a partially trained Open-LLaMA checkpoint (300B tokens).
- The model is inconsistent at outputting '\n' tokens, as the majority of the dataset is obtained from [mosaicml/dolly_hhrlhf](https://huggingface.co/datasets/mosaicml/dolly_hhrlhf), which removed newline characters from responses.
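Because of the missing-newline issue above, downstream code sometimes re-flows long single-line responses for display. A minimal hypothetical post-processing sketch (the `rewrap` helper is illustrative, not part of this model card):

```python
import textwrap

def rewrap(response: str, width: int = 80) -> str:
    """Collapse runs of whitespace, then re-wrap to a fixed width.

    Hypothetical helper: since this checkpoint may omit '\n' tokens,
    callers may want to re-flow long single-line responses for display.
    """
    return textwrap.fill(" ".join(response.split()), width=width)

print(rewrap("The model may emit very long single-line answers.", width=20))
```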
## Evaluation