Update README.md
README.md CHANGED
@@ -7,6 +7,20 @@ This is an HF version of the [Vicuna 7B 1.1 model](https://huggingface.co/lmsys/

It was created by merging the deltas provided in the above repo with the original Llama 7B model, [using the code provided on their GitHub page](https://github.com/lm-sys/FastChat#vicuna-weights).
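For reference, the merge is done with FastChat's `apply_delta` tooling. The snippet below is a rough sketch only: the module and argument names have shifted between FastChat releases, and all paths are illustrative placeholders, so treat the linked FastChat README as authoritative.

```python
# Rough sketch of the delta merge via FastChat's apply_delta module.
# Paths and names below are illustrative assumptions; the FastChat README
# linked above documents the exact, current invocation.
from fastchat.model.apply_delta import apply_delta

apply_delta(
    base_model_path="/path/to/llama-7b",      # original Llama 7B weights, HF format
    target_model_path="/path/to/vicuna-7b",   # output directory for the merged weights
    delta_path="lmsys/vicuna-7b-delta-v1.1",  # assumed id of the delta repo referenced above
)
```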
## My Vicuna 1.1 model repositories

I have the following Vicuna 1.1 repositories available (a short loading sketch for the unquantized HF-format models follows the lists):

**13B models:**
* [Unquantized 13B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-13B-1.1-HF)
* [GPTQ quantized 4bit 13B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g)
* [GPTQ quantized 4bit 13B 1.1 for CPU - GGML format for `llama.cpp`](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g-GGML)

**7B models:**
* [Unquantized 7B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-7B-1.1-HF)
* [GPTQ quantized 4bit 7B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g)
* [GPTQ quantized 4bit 7B 1.1 for CPU - GGML format for `llama.cpp`](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g-GGML)
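The unquantized HF-format repos load like any other Llama-family checkpoint in `transformers`. A minimal sketch, assuming `transformers` 4.28+ (the first release with Llama support), `accelerate` installed for `device_map`, and enough memory for fp16 weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/vicuna-7B-1.1-HF"  # or TheBloke/vicuna-13B-1.1-HF

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 halves memory versus fp32
    device_map="auto",          # needs accelerate; spreads weights across available devices
)

# Vicuna 1.1 moved to a USER:/ASSISTANT: prompt style.
prompt = "USER: What is the capital of France?\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```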
# Vicuna Model Card

## Model details