Update README.md
README.md (CHANGED)
@@ -9,12 +9,15 @@ datasets:
 - emozilla/yarn-train-tokenized-8k-llama
 ---
 
-# Model Card:
+# Model Card: Yarn-Llama-2-70b-32k
 
 [Preprint (arXiv)](https://arxiv.org/abs/2309.00071)
 [GitHub](https://github.com/jquesnelle/yarn)
 ![yarn](https://raw.githubusercontent.com/jquesnelle/yarn/70b/data/proofpile-long-small-32k-70b.csv.png)
 
+The authors would like to thank [LAION AI](https://laion.ai/) for their support of compute for this model.
+It was trained on the [JUWELS](https://www.fz-juelich.de/en/ias/jsc/systems/supercomputers/juwels) supercomputer.
+
 ## Model Description
 
 Nous-Yarn-Llama-2-70b-32k is a state-of-the-art language model for long context, further pretrained on long context data for 400 steps using the YaRN extension method.
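The Model Description in the hunk above credits the YaRN extension method ([arXiv:2309.00071](https://arxiv.org/abs/2309.00071)), which stretches RoPE to longer contexts by interpolating each frequency band differently. As a rough illustration, here is a sketch of the paper's "NTK-by-parts" interpolation, assuming Llama 2's native 4096-token window, a 32768-token target (scale s = 8), and the paper's default ramp bounds; it illustrates the idea only and is not the training code behind this checkpoint.

```python
# Sketch of YaRN "NTK-by-parts" RoPE interpolation, following arXiv:2309.00071.
# Assumptions (not stated in this README): Llama-style RoPE with head dim 128
# and base 10000, Llama 2's native 4096-token window, a 32768-token target
# (scale s = 8), and the paper's ramp bounds alpha = 1, beta = 32.
import math

def yarn_inv_freqs(dim: int = 128, base: float = 10000.0,
                   orig_ctx: int = 4096, scale: float = 8.0,
                   alpha: float = 1.0, beta: float = 32.0) -> list[float]:
    """Per-dimension-pair inverse RoPE frequencies after YaRN interpolation."""
    out = []
    for i in range(0, dim, 2):
        inv_freq = base ** (-i / dim)          # standard RoPE frequency theta_d
        wavelength = 2 * math.pi / inv_freq    # tokens per full rotation
        r = orig_ctx / wavelength              # rotations inside the original window
        # Ramp: r <= alpha -> fully interpolate (divide by s);
        #       r >= beta  -> leave the dimension untouched; blend in between.
        gamma = min(max((r - alpha) / (beta - alpha), 0.0), 1.0)
        out.append(inv_freq * ((1.0 - gamma) / scale + gamma))
    return out

# YaRN additionally applies an attention temperature t, with
# sqrt(1/t) = 0.1 * ln(scale) + 1, usually folded into the query/key
# scaling; it is omitted here for brevity.
```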
@@ -55,6 +58,3 @@ Short context benchmarks showing that quality degradation is minimal:
 - [@theemozilla](https://twitter.com/theemozilla): Methods, paper, model training, and evals
 - [@EnricoShippole](https://twitter.com/EnricoShippole): Model training
 - [honglu2875](https://github.com/honglu2875): Paper and evals
-
-The authors would like to thank LAION AI for their support of compute for this model.
-It was trained on the [JUWELS](https://www.fz-juelich.de/en/ias/jsc/systems/supercomputers/juwels) supercomputer.
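Since no usage snippet appears in the changed portion of the card, here is a hedged loading sketch with Hugging Face transformers. The hub id is inferred from the card title, and every flag below is an assumption rather than something this diff states; YaRN checkpoints have typically shipped custom RoPE-scaling code, hence trust_remote_code.

```python
# Hedged usage sketch: load the model with Hugging Face transformers.
# Assumptions: the hub id "NousResearch/Yarn-Llama-2-70b-32k" (inferred from
# the card title), bf16 weights, and enough GPU memory to shard a 70B model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Yarn-Llama-2-70b-32k"  # assumed, not stated in this diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",        # shard the 70B weights across available GPUs
    trust_remote_code=True,   # YaRN cards ship custom RoPE-scaling code
)

prompt = "In long-context language modeling, the main challenge is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```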