# 1. Prepare Vicuna Checkpoint
The language decoder of NExT-GPT relies on Vicuna version 0, an open-source LLaMA-based LLM.
However, due to LLaMA's distribution license, Vicuna's weights must be restored manually.
The instructions below describe how to restore them.
(These instructions originally come from [PandaGPT](https://github.com/yxuansu/PandaGPT).)
## 1.1. Prepare LLaMA Weights
* Request the original weights of LLaMA from Meta by filling [this form](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform).
* After obtaining the weights of a specific LLaMA model (e.g., 7B or 13B), follow the [instructions](https://huggingface.co/docs/transformers/main/model_doc/llama) provided by Hugging Face to convert them into the Hugging Face format.
> **Note:** After conversion, the directory should look like:

    .
    └── ./{path_to_llama_weights}/
        ├── config.json
        ├── generation_config.json
        ├── pytorch_model-00001-of-00002.bin
        ├── pytorch_model-00002-of-00002.bin
        ├── pytorch_model.bin.index.json
        ├── special_tokens_map.json
        ├── tokenizer.model
        └── tokenizer_config.json
`{path_to_llama_weights}` is where you store the checkpoints.
## 1.2. Prepare the Delta Weights of Vicuna
Then, you should download the delta weights of Vicuna provided by the original authors. You can find the corresponding links to 7B/13B Vicuna models in the table below.
|**Model Size**|**Delta Weights Address**|**Version**|
|:-------------:|:-------------:|:-------------:|
|7B|[[Link]](https://huggingface.co/lmsys/vicuna-7b-delta-v0)|0|
|13B|[[Link]](https://huggingface.co/lmsys/vicuna-13b-delta-v0)|0|
> **Note:** After downloading, the directory should look like:

    .
    └── ./{path_to_delta_vicuna_weights}/
        ├── config.json
        ├── generation_config.json
        ├── pytorch_model-00001-of-00002.bin
        ├── pytorch_model-00002-of-00002.bin
        ├── pytorch_model.bin.index.json
        ├── special_tokens_map.json
        ├── tokenizer.model
        └── tokenizer_config.json
`{path_to_delta_vicuna_weights}` is where you store the delta weights of Vicuna.
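If you script the download, the table above maps directly to Hugging Face repo ids. A minimal sketch (the `vicuna_delta_repo` helper is hypothetical; the repo ids themselves come from the links in the table):

```python
def vicuna_delta_repo(model_size: str, version: str = "v0") -> str:
    """Build the lmsys delta-weight repo id for a given model size.

    Valid sizes per the table above are '7b' and '13b'; version 'v0'
    matches the Vicuna version NExT-GPT expects.
    """
    size = model_size.lower()
    if size not in {"7b", "13b"}:
        raise ValueError(f"unsupported Vicuna size: {model_size!r}")
    return f"lmsys/vicuna-{size}-delta-{version}"

# The result can be passed to huggingface_hub.snapshot_download(repo_id=...)
# or appended to https://huggingface.co/ for a git clone.
```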
## 1.3. Combine the Weights
When the two sets of weights are ready, you can combine them using tools from the Vicuna team.
First, install the required library.
```shell
pip install git+https://github.com/lm-sys/FastChat.git@v0.1.10
```
Then, run the following command.
```shell
python -m fastchat.model.apply_delta \
    --base {path_to_llama_weights} \
    --target ./vicuna_ckpt/7b_v0/ \
    --delta {path_to_delta_vicuna_weights}
```
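If you merge several model sizes, it can be convenient to drive this step from Python. In this sketch the `apply_delta_cmd` helper is an assumption; only the `fastchat.model.apply_delta` module and its three flags come from the command above.

```python
def apply_delta_cmd(base: str, target: str, delta: str) -> list[str]:
    """Build the FastChat apply_delta invocation as an argv list."""
    return [
        "python", "-m", "fastchat.model.apply_delta",
        "--base", base,      # converted LLaMA weights
        "--target", target,  # output dir for the merged Vicuna checkpoint
        "--delta", delta,    # downloaded Vicuna delta weights
    ]

# e.g. subprocess.run(apply_delta_cmd(llama_dir, "./vicuna_ckpt/7b_v0/",
#                                     delta_dir), check=True)
```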
> **Note:** Now, the final weights are ready as:

    .
    └── ./vicuna_ckpt/7b_v0/
        ├── config.json
        ├── generation_config.json
        ├── pytorch_model-00001-of-00002.bin
        ├── pytorch_model-00002-of-00002.bin
        ├── pytorch_model.bin.index.json
        ├── special_tokens_map.json
        ├── tokenizer.model
        └── tokenizer_config.json