Tags: Transformers · GGUF · llama · text-generation-inference
Committed by TheBloke
Commit: 277c65a
Parent: bde5e92

Upload README.md

Files changed (1): README.md (+1, -12)
README.md CHANGED
@@ -105,28 +105,17 @@ Refer to the Provided Files table below to see what files use which methods, and
 | ---- | ---- | ---- | ---- | ---- | ----- |
 | [puddlejumper-13b.Q2_K.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q2_K.gguf) | Q2_K | 2 | 5.43 GB| 7.93 GB | smallest, significant quality loss - not recommended for most purposes |
 | [puddlejumper-13b.Q3_K_S.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q3_K_S.gguf) | Q3_K_S | 3 | 5.66 GB| 8.16 GB | very small, high quality loss |
-| [puddlejumper-13b.q2_K.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q2_K.gguf) | q2_K | 2 | 5.66 GB| 8.16 GB | smallest, significant quality loss - not recommended for most purposes |
-| [puddlejumper-13b.q3_K_S.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q3_K_S.gguf) | q3_K_S | 3 | 5.87 GB| 8.37 GB | very small, high quality loss |
 | [puddlejumper-13b.Q3_K_M.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q3_K_M.gguf) | Q3_K_M | 3 | 6.34 GB| 8.84 GB | very small, high quality loss |
-| [puddlejumper-13b.q3_K_M.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q3_K_M.gguf) | q3_K_M | 3 | 6.55 GB| 9.05 GB | very small, high quality loss |
 | [puddlejumper-13b.Q3_K_L.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q3_K_L.gguf) | Q3_K_L | 3 | 6.93 GB| 9.43 GB | small, substantial quality loss |
-| [puddlejumper-13b.q3_K_L.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q3_K_L.gguf) | q3_K_L | 3 | 7.14 GB| 9.64 GB | small, substantial quality loss |
 | [puddlejumper-13b.Q4_0.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q4_0.gguf) | Q4_0 | 4 | 7.37 GB| 9.87 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
 | [puddlejumper-13b.Q4_K_S.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q4_K_S.gguf) | Q4_K_S | 4 | 7.41 GB| 9.91 GB | small, greater quality loss |
-| [puddlejumper-13b.q4_K_S.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q4_K_S.gguf) | q4_K_S | 4 | 7.61 GB| 10.11 GB | small, greater quality loss |
 | [puddlejumper-13b.Q4_K_M.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q4_K_M.gguf) | Q4_K_M | 4 | 7.87 GB| 10.37 GB | medium, balanced quality - recommended |
-| [puddlejumper-13b.q4_K_M.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q4_K_M.gguf) | q4_K_M | 4 | 8.06 GB| 10.56 GB | medium, balanced quality - recommended |
 | [puddlejumper-13b.Q4_1.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q4_1.gguf) | Q4_1 | 4 | 8.17 GB| 10.67 GB | legacy; small, substantial quality loss - prefer using Q3_K_L |
-| [puddlejumper-13b.q5_0.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q5_0.gguf) | q5_0 | 5 | 8.95 GB| 11.45 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
 | [puddlejumper-13b.Q5_0.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q5_0.gguf) | Q5_0 | 5 | 8.97 GB| 11.47 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
 | [puddlejumper-13b.Q5_K_S.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q5_K_S.gguf) | Q5_K_S | 5 | 8.97 GB| 11.47 GB | large, low quality loss - recommended |
-| [puddlejumper-13b.q5_K_S.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q5_K_S.gguf) | q5_K_S | 5 | 9.15 GB| 11.65 GB | large, low quality loss - recommended |
 | [puddlejumper-13b.Q5_K_M.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q5_K_M.gguf) | Q5_K_M | 5 | 9.23 GB| 11.73 GB | large, very low quality loss - recommended |
-| [puddlejumper-13b.q5_K_M.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q5_K_M.gguf) | q5_K_M | 5 | 9.40 GB| 11.90 GB | large, very low quality loss - recommended |
 | [puddlejumper-13b.Q5_1.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q5_1.gguf) | Q5_1 | 5 | 9.78 GB| 12.28 GB | legacy; medium, low quality loss - prefer using Q5_K_M |
 | [puddlejumper-13b.Q6_K.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q6_K.gguf) | Q6_K | 6 | 10.68 GB| 13.18 GB | very large, extremely low quality loss |
-| [puddlejumper-13b.q6_K.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q6_K.gguf) | q6_K | 6 | 10.83 GB| 13.33 GB | very large, extremely low quality loss |
-| [puddlejumper-13b.q8_0.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.q8_0.gguf) | q8_0 | 8 | 13.83 GB| 16.33 GB | very large, extremely low quality loss - not recommended |
 | [puddlejumper-13b.Q8_0.gguf](https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF/blob/main/puddlejumper-13b.Q8_0.gguf) | Q8_0 | 8 | 13.83 GB| 16.33 GB | very large, extremely low quality loss - not recommended |
 
 **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
@@ -184,7 +173,7 @@ CT_METAL=1 pip install ctransformers>=0.2.24 --no-binary ctransformers
 from ctransformers import AutoModelForCausalLM
 
 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
-llm = AutoModelForCausalLM.from_pretrained("TheBloke/PuddleJumper-13B-GGML", model_file="puddlejumper-13b.q4_K_M.gguf", model_type="llama", gpu_layers=50)
+llm = AutoModelForCausalLM.from_pretrained("TheBloke/PuddleJumper-13B-GGUF", model_file="puddlejumper-13b.q4_K_M.gguf", model_type="llama", gpu_layers=50)
 
 print(llm("AI is going to"))
 ```
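For context (not part of this commit): a minimal sketch of the corrected ctransformers call, connecting the `gpu_layers` parameter to the RAM note in the table above. The uppercase `Q4_K_M` filename matches the table after this commit, and `gpu_layers=0` is an illustrative CPU-only choice, not a value from the diff:

```python
from ctransformers import AutoModelForCausalLM

# With gpu_layers=0 nothing is offloaded, so expect roughly the "Max RAM
# required" figure from the table above (about 10.37 GB for Q4_K_M);
# raising gpu_layers moves that footprint from system RAM into VRAM.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/PuddleJumper-13B-GGUF",
    model_file="puddlejumper-13b.Q4_K_M.gguf",  # uppercase name, as listed in the table
    model_type="llama",
    gpu_layers=0,
)

print(llm("AI is going to"))
```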
 