mabaochang committed · Commit 732daf4 · 1 Parent(s): 0ce6066
Update README.md

README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-license:
+license: gpl-3.0
 tags:
 - text2text-generation
 pipeline_tag: text2text-generation
@@ -53,12 +53,39 @@ c066b68b4139328e87a694020fc3a6c3 ./special_tokens_map.json.ca3d163bab0553818272
 39ec1b33fbf9a0934a8ae0f9a24c7163 ./tokenizer.model.9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347.enc
 ```
 
-2. Decrypt the files using https://github.com/LianjiaTech/BELLE/tree/main/models
+2. Decrypt the files using the scripts in https://github.com/LianjiaTech/BELLE/tree/main/models
+
+You can use the following command in Bash.
+Please replace "/path/to_encrypted" with the path where you stored your encrypted files,
+replace "/path/to_original_llama_7B" with the path where you stored your original LLaMA 7B weights,
+and replace "/path/to_finetuned_model" with the path where you want to save your final fine-tuned model.
+
+```bash
+mkdir /path/to_finetuned_model
+for f in "/path/to_encrypted"/*; \
+do if [ -f "$f" ]; then \
+python3 decrypt.py "$f" "/path/to_original_llama_7B/consolidated.00.pth" "/path/to_finetuned_model/"; \
+fi; \
+done
+```
+
+After executing the command above, you will obtain the following files.
+
 ```
-
+./config.json
+./generation_config.json
+./pytorch_model-00001-of-00002.bin
+./pytorch_model-00002-of-00002.bin
+./pytorch_model.bin.index.json
+./special_tokens_map.json
+./tokenizer_config.json
+./tokenizer.model
 ```
 
 3. Check md5sum
+
+You can verify the integrity of these files by checking their MD5 checksums to confirm they were recovered completely.
+Here are the MD5 checksums for the relevant files:
 ```
 md5sum ./*
 a57bf2d0d7ec2590740bc4175262610b ./config.json
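If you want step 3 to fail loudly instead of eyeballing the `md5sum` output, a minimal Python sketch along these lines can compare each file against its expected digest. Only the digest for `config.json` is reproduced from the listing above; the `EXPECTED` table and `verify_md5` helper are illustrative, not part of the BELLE scripts.

```python
import hashlib
from pathlib import Path

# Expected digests from the README's `md5sum ./*` listing. Only config.json
# is reproduced here; copy the remaining entries from the listing above.
EXPECTED = {
    "config.json": "a57bf2d0d7ec2590740bc4175262610b",
}

def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash in 1 MiB chunks so multi-GB model shards are not read into RAM."""
    h = hashlib.md5()
    with path.open("rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

def verify_md5(model_dir: str) -> bool:
    """Return True only if every listed file matches its expected digest."""
    ok = True
    for name, expected in EXPECTED.items():
        actual = md5_of(Path(model_dir) / name)
        if actual != expected:
            print(f"MISMATCH {name}: {actual} != {expected}")
            ok = False
    return ok

if __name__ == "__main__":
    assert verify_md5("/path/to_finetuned_model"), "decryption output is corrupt"
```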
@@ -87,7 +114,7 @@ After you decrypt the files, BELLE-LLAMA-7B-2M can be easily loaded with LlamaFo
 from transformers import LlamaForCausalLM, AutoTokenizer
 import torch
 
-ckpt = '
+ckpt = '/path/to_finetuned_model/'
 device = torch.device('cuda')
 model = LlamaForCausalLM.from_pretrained(ckpt, device_map='auto', low_cpu_mem_usage=True)
 tokenizer = AutoTokenizer.from_pretrained(ckpt)
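As a follow-up to the loading snippet, here is a minimal inference sketch. It assumes the variables from the code above (`model`, `tokenizer`, `device`) and a `Human:`/`Assistant:` prompt template commonly used with BELLE models; the template and sampling settings are assumptions, not something this diff specifies.

```python
# Continues from the loading snippet above: `model`, `tokenizer`, and
# `device` are assumed to exist. The prompt template and sampling settings
# below are assumptions, not taken from this README.
prompt = "Human: Write a two-line poem about the sea. \n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,   # cap the reply length
    do_sample=True,       # sample rather than greedy-decode
    top_p=0.9,
    temperature=0.7,
)

# Drop the prompt tokens so only the newly generated reply is decoded.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(reply)
```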