Mozilla/llm-compiler-13b-ftd-llamafile


jartine committed on Jun 29

Commit 0d16a68 (1 parent: 1550162)

Update README.md

Files changed (1)

README.md +8 -25


@@ -26,10 +26,10 @@ NetBSD for AMD64 and ARM64.
 ## Quickstart
-Running the following on a desktop OS will launch a tab in your web
-browser with a chatbot interface.
 ```
 chmod +x llm-compiler-13b-ftd.Q6_K.llamafile
 ./llm-compiler-13b-ftd.Q6_K.llamafile --help
 ```
@@ -152,8 +152,7 @@ of consuming significantly less memory.
 ### Q6\_K, F16, BF16
-The output for the first xdigit() assembly optimization example with
-Q6\_K, F16, or BF16 weights is:
 ```asm
         .globl  hiho
@@ -172,16 +171,10 @@ hiho:   movl    %edi, %ecx
         retq
 ```
-This implementation is **CORRECT**. This implementation is **GOOD**.
-NOTE: BF16 is currently only supported on CPU. It's the best quant for
-prompt processing on Zen4.
 ### Q5\_0
-The output for the first xdigit() assembly optimization example with
-Q5\_0 weights is:
 ```asm
         .globl  hiho
 hiho:   movl    %edi, %ecx
@@ -204,15 +197,10 @@ hiho:   movl    %edi, %ecx
         retq
 ```
-This implementation is **INCORRECT**. Even if it did work, it has
-unneeded stack spillage and branches. Therefore this quantization format
-was rejected and isn't included in this repository.
 ### Q5\_K\_M
-The output for the first xdigit() assembly optimization example with
-Q5\_K\_M weights is:
 ```asm
 	.globl	hiho
 hiho:   movl    $48, %eax
@@ -245,14 +233,11 @@ hiho:   movl    $48, %eax
         retq
 ```
-This implementation is **INCORRECT**. Even if it did work, it has
-unneeded stack spillage and branches. Therefore this quantization format
-was rejected and isn't included in this repository.
 ### Q8\_0
-The output for the first xdigit() assembly optimization example with
-Q8\_0 weights is:
 ```asm
         .globl  hiho
@@ -286,9 +271,7 @@ hiho:   movl    $48, %eax
         retq
 ```
-This implementation is **INCORRECT**. Even if it did work, it has
-unneeded stack spillage and branches. Therefore this quantization format
-was rejected and isn't included in this repository.
 ---

 ## Quickstart
+Running the following will display the help:
 ```
+wget https://huggingface.co/Mozilla/llm-compiler-13b-ftd-llamafile/resolve/main/llm-compiler-13b-ftd.Q6_K.llamafile
 chmod +x llm-compiler-13b-ftd.Q6_K.llamafile
 ./llm-compiler-13b-ftd.Q6_K.llamafile --help
 ```
 ### Q6\_K, F16, BF16
+The output for optimizing xdigit() is:
 ```asm
         .globl  hiho
         retq
 ```
+This implementation is **CORRECT**.
 ### Q5\_0
 ```asm
         .globl  hiho
 hiho:   movl    %edi, %ecx
         retq
 ```
+This implementation is **INCORRECT**.
 ### Q5\_K\_M
 ```asm
 	.globl	hiho
 hiho:   movl    $48, %eax
         retq
 ```
+This implementation is **INCORRECT**.
 ### Q8\_0
+The output for optimizing xdigit() is:
 ```asm
         .globl  hiho
         retq
 ```
+This implementation is **INCORRECT**.
 ---