Update README.md
README.md
@@ -30,7 +30,7 @@ This model has a slightly different architecture and training style:
2. Base model was pretrained on 75k instruction/response pairs and merged.
3. Similar architecture to the palmer series, but with a smaller context size (8192).

-In short, palmer is now half the size and twice the speed, with the same overall performance and a notable improvement on mmlu and arc challenge rather than winogrande.
+In short, palmer is now half the size and twice the speed, with the same overall performance and a notable improvement on mmlu and arc challenge rather than winogrande. As of Wed 17 Jul, it beats all models <= 0.5b on hellaswag.

As with all palmer models, the model is biased to respond without using any specific prompt; feel free to further fine-tune it for your specific use case.
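Because the model answers without a prompt template, plain text completion is enough. Below is a minimal sketch using the Hugging Face transformers pipeline; the repository id shown is a placeholder (this page does not name one), so substitute the actual model id:

```python
# Minimal sketch of prompt-free usage, assuming the model is hosted on the
# Hugging Face Hub; "appvoid/palmer-004" is a placeholder id, not confirmed
# by this README -- substitute the actual repository name.
from transformers import pipeline

generator = pipeline("text-generation", model="appvoid/palmer-004")

# No chat template or system prompt: the model is biased to answer plain text.
output = generator(
    "The highest mountain on Earth is",
    max_new_tokens=32,
    do_sample=False,  # greedy decoding for a reproducible completion
)
print(output[0]["generated_text"])
```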