RWKV EagleX 7B v2 Model
Important: this model is not meant to be used with the Hugging Face transformers library.
The following is the raw representation of the EagleX 7B Chat V0.5 model, for use with our own set of inference libraries and tools.
This is the chat/instruct tune, with a context length of up to 100k.
Chat Format
The chat format is identical to ChatML, with one difference: the <|im_start|> and <|im_end|> special tokens are replaced with the otherwise unused ASCII control codes "\x16" and "\x17" respectively. This was done to ensure complete compatibility with the standard RWKV World tokenizer.
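As a minimal sketch, a prompt in this format could be assembled with a small helper like the one below. The name `format_chat` and its parameters are hypothetical and not part of any RWKV library. The newline between turns and the open `\x16assistant` turn at the end are assumptions based on the layout of the example in this section.

```python
# Control characters that stand in for ChatML's <|im_start|> and <|im_end|>.
IM_START = "\x16"
IM_END = "\x17"

def format_chat(messages, add_generation_prompt=True):
    """Render (role, content) pairs in the EagleX chat format.

    Hypothetical helper: assumes each turn is followed by a newline.
    """
    parts = [f"{IM_START}{role}\n{content}{IM_END}\n" for role, content in messages]
    prompt = "".join(parts)
    if add_generation_prompt:
        # Assumption: leave the assistant turn open so the model writes the reply.
        prompt += f"{IM_START}assistant\n"
    return prompt

messages = [
    ("system", "You are Eagle, a helpful personal assistant."),
    ("user", "Suggest a short text message to go with a kitten gif."),
]
prompt = format_chat(messages)
print(repr(prompt))  # repr() makes the \x16 and \x17 control codes visible
```

When generating, it is probably worth stopping as soon as the model emits "\x17", since that marks the end of the assistant turn.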
Example:
\x16system
You are Eagle, a very smart, creative and helpful personal assistant. Assist the user in any way you can with detailed and smart responses.\x17
\x16user
I want to cheer up my friend who's having a rough day. Can you suggest a couple short and sweet text messages to go with a kitten gif?\x17
\x16assistant
Sure, I can help you with that! Here are a couple of suggestions for text messages:
"You're the cutest thing ever! I hope you have a great day!"
"I know you're having a tough day, but you're amazing. Hang in there!"
"Thinking of you and sending you lots of love and light. You're amazing just the way you are."\x17
\x16user
Have you got anything even more sympathetic?\x17
\x16assistant
Sure, here are a couple of additional suggestions:
"I'm so sorry you're having a tough day. You're an amazing person, and I'm so proud of you."
"You're a wonderful person, and you're going through a tough time right now. Just remember that you're strong and capable, and you can get through this."
"I'm here for you, and I'm going to do everything I can to support you. You're amazing, and you're going to get through this."
"I'm so sorry you're having a tough day. You're an amazing person, and I'm so proud of you. You're going to get through this, and you're going to be even stronger and happier in the future."\x17
Quickstart with the PIP library
https://pypi.org/project/rwkv/
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS
# download models: https://huggingface.co/BlinkDL
model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-169m/RWKV-4-Pile-169M-20220807-8023', strategy='cpu fp32')
pipeline = PIPELINE(model, "20B_tokenizer.json") # 20B_tokenizer.json is in https://github.com/BlinkDL/ChatRWKV
# use pipeline = PIPELINE(model, "rwkv_vocab_v20230424") for rwkv "world" models
ctx = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."
print(ctx, end='')
def my_print(s):
    print(s, end='', flush=True)
# For alpha_frequency and alpha_presence, see "Frequency and presence penalties":
# https://platform.openai.com/docs/api-reference/parameter-details
args = PIPELINE_ARGS(temperature = 1.0, top_p = 0.7, top_k = 100, # top_k = 0 then ignore
                     alpha_frequency = 0.25,
                     alpha_presence = 0.25,
                     alpha_decay = 0.996, # gradually decay the penalty
                     token_ban = [0], # ban the generation of some tokens
                     token_stop = [], # stop generation whenever you see any token here
                     chunk_len = 256) # split input into chunks to save VRAM (shorter -> slower)
pipeline.generate(ctx, token_count=200, args=args, callback=my_print)
print('\n')
out, state = model.forward([187, 510, 1563, 310, 247], None)
print(out.detach().cpu().numpy()) # get logits
out, state = model.forward([187, 510], None)
out, state = model.forward([1563], state) # RNN has state (use deepcopy to clone states)
out, state = model.forward([310, 247], state)
print(out.detach().cpu().numpy()) # same result as above
print('\n')
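The last few lines of the quickstart rely on the model being an RNN: feeding tokens in chunks while carrying the state forward gives the same result as a single pass. The toy below illustrates that property without loading any weights. Its `forward` function is a hypothetical stand-in, not the real `model.forward`, and the "state" is just a small list of integers.

```python
import copy

def forward(tokens, state):
    """Toy stand-in for model.forward: folds tokens into a running state."""
    # Copy first so the caller's state object is never mutated.
    state = [0, 0] if state is None else copy.deepcopy(state)
    for t in tokens:
        state[0] = (state[0] * 31 + t) % 1_000_003  # running summary of the sequence
        state[1] += 1                               # number of tokens seen so far
    out = state[0]  # pretend this is the output for the last token
    return out, state

tokens = [187, 510, 1563, 310, 247]

# One pass over the whole sequence.
full_out, _ = forward(tokens, None)

# The same tokens in three chunks, carrying the state forward.
out, state = forward(tokens[:2], None)
snapshot = copy.deepcopy(state)  # keep a copy to branch from later
out, state = forward(tokens[2:3], state)
chunked_out, state = forward(tokens[3:], state)

print(full_out == chunked_out)  # True: chunking does not change the result
```

Keeping a deep copy such as `snapshot` is what makes it possible to process a shared prefix (for example a system prompt) once and then continue from it several times.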
Ramblings
Several new techniques were used to build the instruct dataset, including:
- Smart packing of the instruct pairs (to improve long-context, multi-turn conversation)
- Smart grouping of different context lengths and data categories/priorities (to improve training efficiency)
- Variable context length training (courtesy of https://github.com/RWKV/RWKV-infctx-trainer)
- A bunch of synthetic data to increase long-context usage and reasoning (to be released soon...)
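The packing code used for this model is not described here, so the following is only a generic illustration of what packing conversations into fixed-size training samples can look like. It uses plain first-fit-decreasing bin packing over token counts. The function name, the example lengths, and the 1024-token budget are all made up for the sketch.

```python
def pack_conversations(lengths, max_len):
    """Greedy first-fit-decreasing packing of conversation lengths into bins.

    Returns a list of bins, each a list of indices into `lengths`.
    Illustrative only: not the packing method used to train this model.
    """
    bins = []  # each bin tracks its remaining budget and the items placed in it
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"conversation {i} ({n} tokens) exceeds max_len={max_len}")
        for b in bins:
            if b["free"] >= n:  # first bin with enough room wins
                b["items"].append(i)
                b["free"] -= n
                break
        else:
            # No existing bin had room, so open a new one.
            bins.append({"free": max_len - n, "items": [i]})
    return [b["items"] for b in bins]

lengths = [900, 300, 700, 100, 500, 400]  # hypothetical token counts
packed = pack_conversations(lengths, max_len=1024)
for items in packed:
    print(items, sum(lengths[i] for i in items))
```

Sorting longest-first tends to leave fewer half-empty samples than packing in arrival order, at the cost of grouping conversations by length rather than by topic.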
Acknowledgement
We are grateful for the help and support from the following key groups:
- Recursal.ai team for financing the GPU resources and managing the training of this model. You can run the Eagle line of RWKV models on their cloud / on-premise platform today.
- Dataset built and model finetuned by @m8than
- EleutherAI for their support, especially on the v5/v6 Eagle/Finch paper
- Linux Foundation AI & Data group for supporting and hosting the RWKV project