Add sharded model example

#5
Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -35,6 +35,14 @@ quantized_by: MaziyarPanahi
  ## Description
  [MaziyarPanahi/WizardLM-2-8x22B-GGUF](https://huggingface.co/MaziyarPanahi/WizardLM-2-8x22B-GGUF) contains GGUF format model files for [microsoft/WizardLM-2-8x22B](https://huggingface.co/microsoft/WizardLM-2-8x22B).
 
+ ## Load sharded model
+
+ `llama_load_model_from_file` detects the number of shard files and loads the remaining tensors from the other shards, so only the first file needs to be passed with `-m`.
+
+ ```sh
+ llama.cpp/main -m WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 1024 -e
+ ```
+
 
  ## Prompt template
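
Since loading starts from the first shard and pulls in the rest, it can help to confirm that all shards are in the same directory before running the command. The following is a minimal sketch, not part of llama.cpp or this PR. It assumes the shards follow the `-NNNNN-of-NNNNN.gguf` naming seen in the example file name, and `list_shards` is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): print every shard file name
# implied by the first shard, assuming the -NNNNN-of-NNNNN.gguf pattern.
list_shards() {
    first=$1
    total=${first##*-of-}                    # e.g. "00005.gguf"
    total=${total%.gguf}                     # e.g. "00005"
    prefix=${first%-*-of-*}                  # name without "-00001-of-00005.gguf"
    count=$(echo "$total" | sed 's/^0*//')   # drop zero padding, e.g. "5"
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s-%05d-of-%s.gguf\n' "$prefix" "$i" "$total"
        i=$((i + 1))
    done
}

# Report any shard that is not present in the current directory.
list_shards WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf | while IFS= read -r f; do
    [ -f "$f" ] || echo "missing shard: $f"
done
```

If nothing is printed, every expected shard was found and the first file can be passed to `-m` as shown in the README example.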