Upload folder using huggingface_hub
README.md
CHANGED
@@ -1,6 +1,23 @@
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
# K2-Chat: a fully-reproducible large language model outperforming Llama 2 70B Chat using 35% less compute
|
5 |
|
6 |
K2 Chat is finetuned from [K2-65B](https://huggingface.co/LLM360/K2). K2 Chat outperforms Llama 2-70B-Chat on all evaluations conducted. The model also outperforms Llama 3-70B-Instruct on coding tasks.
|
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
+
**ExLlamaV2** quant (**exl2** / **3.5 bpw**) made with ExLlamaV2 v0.1.1
|
5 |
+
|
6 |
+
Other EXL2 quants:
|
7 |
+
| **Quant** | **Model Size** | **lm_head** |
|
8 |
+
| ----- | ---------- | ------- |
|
9 |
+
|<center>**[2.2](https://huggingface.co/Zoyd/LLM360_K2-Chat-2_2bpw_exl2)**</center> | <center>17685 MB</center> | <center>6</center> |
|
10 |
+
|<center>**[2.5](https://huggingface.co/Zoyd/LLM360_K2-Chat-2_5bpw_exl2)**</center> | <center>20000 MB</center> | <center>6</center> |
|
11 |
+
|<center>**[3.0](https://huggingface.co/Zoyd/LLM360_K2-Chat-3_0bpw_exl2)**</center> | <center>23857 MB</center> | <center>6</center> |
|
12 |
+
|<center>**[3.5](https://huggingface.co/Zoyd/LLM360_K2-Chat-3_5bpw_exl2)**</center> | <center>27721 MB</center> | <center>6</center> |
|
13 |
+
|<center>**[3.75](https://huggingface.co/Zoyd/LLM360_K2-Chat-3_75bpw_exl2)**</center> | <center>29647 MB</center> | <center>6</center> |
|
14 |
+
|<center>**[4.0](https://huggingface.co/Zoyd/LLM360_K2-Chat-4_0bpw_exl2)**</center> | <center>31549 MB</center> | <center>6</center> |
|
15 |
+
|<center>**[4.25](https://huggingface.co/Zoyd/LLM360_K2-Chat-4_25bpw_exl2)**</center> | <center>33505 MB</center> | <center>6</center> |
|
16 |
+
|<center>**[5.0](https://huggingface.co/Zoyd/LLM360_K2-Chat-5_0bpw_exl2)**</center> | <center>39300 MB</center> | <center>6</center> |
|
17 |
+
|<center>**[6.0](https://huggingface.co/Zoyd/LLM360_K2-Chat-6_0bpw_exl2)**</center> | <center>46927 MB</center> | <center>8</center> |
|
18 |
+
|<center>**[6.5](https://huggingface.co/Zoyd/LLM360_K2-Chat-6_5bpw_exl2)**</center> | <center>50613 MB</center> | <center>8</center> |
|
19 |
+
|<center>**[8.0](https://huggingface.co/Zoyd/LLM360_K2-Chat-8_0bpw_exl2)**</center> | <center>49516 MB</center> | <center>8</center> |
|
20 |
+
|
21 |
# K2-Chat: a fully-reproducible large language model outperforming Llama 2 70B Chat using 35% less compute
|
22 |
|
23 |
K2 Chat is finetuned from [K2-65B](https://huggingface.co/LLM360/K2). K2 Chat outperforms Llama 2-70B-Chat on all evaluations conducted. The model also outperforms Llama 3-70B-Instruct on coding tasks.
|
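The links in the EXL2 quant table added by this change follow one naming pattern: the bits-per-weight label with `.` replaced by `_`, followed by `bpw_exl2`. The sketch below builds a repository id from a label and, optionally, fetches that quant. The helper names `exl2_repo_id` and `download_quant` and the `local_dir` argument are hypothetical and used only for illustration. The pattern is inferred from the table links, not documented by the uploader, so a label that is not listed in the table may not correspond to an existing repository.

```python
def exl2_repo_id(bpw: str) -> str:
    """Build the repo id for a bits-per-weight label such as "3.5".

    Assumption: the pattern is inferred from the links in the quant table.
    """
    return f"Zoyd/LLM360_K2-Chat-{bpw.replace('.', '_')}bpw_exl2"


def download_quant(bpw: str, local_dir: str) -> str:
    """Download every file of the chosen quant and return the local path.

    huggingface_hub is imported lazily so the helper above works without it.
    The listed quants are tens of gigabytes, so check free disk space first.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=exl2_repo_id(bpw), local_dir=local_dir)


if __name__ == "__main__":
    # Only prints the repo id; call download_quant(...) explicitly to fetch.
    print(exl2_repo_id("3.5"))
```

Keeping the id construction separate from the download makes it possible to check the target repository name before starting a multi-gigabyte transfer.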