Update README.md
README.md
CHANGED
@@ -47,6 +47,12 @@ Ooba use: Be sure to increase the `Truncate the prompt up to this length` parame
 - Overall, it appears that YaRN is capable of extending the context window with minimal impact to short context performance, when compared to other methods. Furthermore, it's able to do this with a FAR higher scaling factor, which with other methods (especially PI), resulted in serious performance degradation at shorter context lengths.
 - Both the YaRN and Code LLama papers suggest that YaRN and NTK scaling may ameliorate the issue of "U shaped" attention to some degree, where long context models struggle to attend to information in the middle of the context window. Further study is needed to evaluate this. Anecdotal feedback from the community on this issue would be appreciated!
 
+### Benchmarks
+
+ARC (25 shot): 60.32
+Hellaswag (10 shot): 83.90
+MMLU (5 shot): 54.39
+
 ## Prompting:
 
 Prompting differs with the airoboros 2.1 models. See [jondurbin/airoboros-l2-13b-2.1](https://huggingface.co/jondurbin/airoboros-l2-13b-2.1)