jartine committed
Commit 224f275
Parent(s): da37751

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
@@ -61,7 +61,7 @@ For further information, please see the [llamafile
 README](https://github.com/mozilla-ocho/llamafile/).
 
 Having **trouble?** See the ["Gotchas"
-section](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas)
+section](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas-and-troubleshooting)
 of the README.
 
 ## About Upload Limits
@@ -117,7 +117,7 @@ Your choice of quantization format depends on three things:
 
 1. Will it fit in RAM or VRAM?
 2. Is your use case reading (e.g. summarization) or writing (e.g. chatbot)?
-3. llamafiles bigger than 4.30 GB are hard to run on Windows (see [gotchas](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas))
+3. llamafiles bigger than 4.30 GB are hard to run on Windows (see [gotchas](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas-and-troubleshooting))
 
 Good quants for writing (prediction speed) are Q5\_K\_M, and Q4\_0. Text
 generation is bounded by memory speed, so smaller quants help, but they
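The README text in this hunk bases the choice of quant on two size questions: whether the file fits in your RAM or VRAM, and whether it crosses the 4.30 GB limit that makes llamafiles hard to run directly on Windows. A minimal sketch of those two checks, assuming a locally downloaded llamafile or GGUF and a user-supplied memory budget (the script and its flags are illustrative, not part of llamafile):

```python
import argparse
import os

# Illustrative only (not part of this commit): the two size checks described
# in the README text above. The RAM/VRAM budget is supplied by the user; the
# 4.30 GB figure is the Windows executable-size limit the diff refers to.
WINDOWS_EXE_LIMIT_GB = 4.30

def check_llamafile(path: str, available_gb: float) -> None:
    size_gb = os.path.getsize(path) / 1e9
    print(f"{path}: {size_gb:.2f} GB")
    if size_gb > available_gb:
        print(f"  larger than the {available_gb:.1f} GB of RAM/VRAM you specified")
    if size_gb > WINDOWS_EXE_LIMIT_GB:
        print("  over 4.30 GB: hard to run directly on Windows (see the gotchas link)")

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("llamafile", help="path to a .llamafile or .gguf")
    parser.add_argument("--memory-gb", type=float, required=True,
                        help="RAM or VRAM you can spare, in GB")
    args = parser.parse_args()
    check_llamafile(args.llamafile, args.memory_gb)
```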