Update README.md
README.md
@@ -26,10 +26,10 @@ NetBSD for AMD64 and ARM64.

## Quickstart

-Running the following
-browser with a chatbot interface.

```
chmod +x llm-compiler-13b-ftd.Q6_K.llamafile
./llm-compiler-13b-ftd.Q6_K.llamafile --help
```
@@ -152,8 +152,7 @@ of consuming significantly less memory.

### Q6\_K, F16, BF16

-The output for
-Q6\_K, F16, or BF16 weights is:

```asm
.globl hiho
@@ -172,16 +171,10 @@ hiho: movl %edi, %ecx
retq
```

-This implementation is **CORRECT**.
-
-NOTE: BF16 is currently only supported on CPU. It's the best quant for
-prompt processing on Zen4.

### Q5\_0

-The output for the first xdigit() assembly optimization example with
-Q5\_0 weights is:
-
```asm
.globl hiho
hiho: movl %edi, %ecx
@@ -204,15 +197,10 @@ hiho: movl %edi, %ecx
retq
```

-This implementation is **INCORRECT**.
-unneeded stack spillage and branches. Therefore this quantization format
-was rejected and isn't included in this repository.

### Q5\_K\_M

-The output for the first xdigit() assembly optimization example with
-Q5\_K\_M weights is:
-
```asm
.globl hiho
hiho: movl $48, %eax
@@ -245,14 +233,11 @@ hiho: movl $48, %eax
retq
```

-This implementation is **INCORRECT**.
-unneeded stack spillage and branches. Therefore this quantization format
-was rejected and isn't included in this repository.

### Q8\_0

-The output for
-Q8\_0 weights is:

```asm
.globl hiho
@@ -286,9 +271,7 @@ hiho: movl $48, %eax
retq
```

-This implementation is **INCORRECT**.
-unneeded stack spillage and branches. Therefore this quantization format
-was rejected and isn't included in this repository.

---

@@ -26,10 +26,10 @@ NetBSD for AMD64 and ARM64.

## Quickstart

+Running the following will display the help:

```
+wget https://huggingface.co/Mozilla/llm-compiler-13b-ftd-llamafile/resolve/main/llm-compiler-13b-ftd.Q6_K.llamafile
chmod +x llm-compiler-13b-ftd.Q6_K.llamafile
./llm-compiler-13b-ftd.Q6_K.llamafile --help
```
@@ -152,8 +152,7 @@ of consuming significantly less memory.

### Q6\_K, F16, BF16

+The output for optimizing xdigit() is:

```asm
.globl hiho
@@ -172,16 +171,10 @@ hiho: movl %edi, %ecx
retq
```

+This implementation is **CORRECT**.

### Q5\_0

```asm
.globl hiho
hiho: movl %edi, %ecx
@@ -204,15 +197,10 @@ hiho: movl %edi, %ecx
retq
```

+This implementation is **INCORRECT**.

### Q5\_K\_M

```asm
.globl hiho
hiho: movl $48, %eax
@@ -245,14 +233,11 @@ hiho: movl $48, %eax
retq
```

+This implementation is **INCORRECT**.

### Q8\_0

+The output for optimizing xdigit() is:

```asm
.globl hiho
@@ -286,9 +271,7 @@ hiho: movl $48, %eax
retq
```

+This implementation is **INCORRECT**.

---
