# Vicuna-13B-V1.1
Vicuna 13B model weights.
- 2023.04.16: Obtained the Vicuna weights by merging the LLaMA-13B base model with the Vicuna delta weights v1.1, and uploaded them to the huggingface.co model repository https://huggingface.co/uukuguy/vicuna-13b-v1.1
```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/uukuguy/vicuna-13b-v1.1
# If you want to clone without downloading the large files (just their pointers),
# prepend the clone command with this environment variable:
# GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/uukuguy/vicuna-13b-v1.1
```
## Model Card
### Model details
- Model type: Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. It is an auto-regressive language model based on the transformer architecture.
- Model date: The Vicuna-13B-V1.1 weights were merged in April 2023.
- Organizations developing the model: The Vicuna team, with members from UC Berkeley, CMU, Stanford, and UC San Diego.
- Paper or resources for more information: https://vicuna.lmsys.org/
- License: Apache License 2.0
- Where to send questions or comments about the model: https://github.com/uukuguy/Vicuna-LoRA/issues
### Intended use
- Primary intended uses: The primary use of Vicuna is research on large language models and chatbots.
- Primary intended users: The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
### Major updates of weights v1.1
- Refactored the tokenization and separator. In Vicuna v1.1, the separator has been changed from `"###"` to the EOS token `</s>`. This change makes it easier to determine when generation should stop and enables better compatibility with other libraries.
- Fixed the supervised fine-tuning loss computation for better model quality.
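
The separator change above can be illustrated with a minimal prompt-building sketch. Note that the system prompt and `USER`/`ASSISTANT` role tags below are assumptions based on the commonly used v1.1 conversation template, not something defined in this repository:

```python
# Sketch of the Vicuna v1.1 conversation format (assumed template): each
# completed assistant turn ends with the EOS token "</s>", which also serves
# as the generation stop criterion.
SEPARATOR = "</s>"  # v1.1 EOS separator; v1.0 used "###"

def build_prompt(turns):
    """Build a v1.1-style prompt from (user, assistant) turn pairs.

    Pass assistant=None for the final, not-yet-answered turn.
    """
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    parts = [system]
    for user, assistant in turns:
        parts.append(f" USER: {user} ASSISTANT:")
        if assistant is not None:
            # Completed assistant turns are closed with the EOS separator.
            parts.append(f" {assistant}{SEPARATOR}")
    return "".join(parts)

def strip_at_eos(generated):
    """Truncate raw model output at the first EOS separator."""
    return generated.split(SEPARATOR, 1)[0]

prompt = build_prompt([("Hello!", "Hi there!"), ("How are you?", None)])
```

Because the separator is now a single EOS token rather than the multi-token string `"###"`, decoding libraries can stop generation simply by watching for the tokenizer's EOS id (e.g. passing `eos_token_id` to `generate` in Hugging Face `transformers`), which is the compatibility benefit the update describes.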