jartine committed on
Commit
5aa7125
1 Parent(s): 7cbae06

Update README.md

  1. README.md +5 -0
README.md CHANGED
@@ -34,6 +34,11 @@ chmod +x llm-compiler-13b-ftd.F16.llamafile
 ./llm-compiler-13b-ftd.F16.llamafile --help
 ```
 
+This model has a max context window size of 16k tokens. The `.args` file
+inside these llamafiles have been configured to specify `-c 0 --temp 0`
+so that the max context size is used by default, and randomness is
+disabled by default too (since it's unhelpful for this model).
+
 On GPUs with sufficient RAM, the `-ngl 999` flag may be passed to use
 the system's NVIDIA or AMD GPU(s). On Windows, only the graphics card
 driver needs to be installed. If the prebuilt DSOs should fail, the CUDA