Update README.md
README.md
CHANGED
@@ -34,6 +34,11 @@ chmod +x llm-compiler-13b-ftd.F16.llamafile
 ./llm-compiler-13b-ftd.F16.llamafile --help
 ```
 
+This model has a max context window size of 16k tokens. The `.args` file
+inside these llamafiles have been configured to specify `-c 0 --temp 0`
+so that the max context size is used by default, and randomness is
+disabled by default too (since it's unhelpful for this model).
+
 On GPUs with sufficient RAM, the `-ngl 999` flag may be passed to use
 the system's NVIDIA or AMD GPU(s). On Windows, only the graphics card
 driver needs to be installed. If the prebuilt DSOs should fail, the CUDA