Llama 2 Chat 70B for RK3588

This is a conversion of https://huggingface.co/meta-llama/Llama-2-70b-chat-hf to the RKLLM format for Rockchip devices. RKLLM models are meant to run on the NPU of the RK3588.

Convert to one file

Run:

cat llama2-chat-70b-hf-0* > llama2-chat-70b-hf.rkllm
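
As a quick sanity check after merging, the size of the merged file can be compared with the combined size of the parts. This is only a minimal sketch: the helper name `check_merge` is made up for illustration, and it assumes the parts are the only files matching the `llama2-chat-70b-hf-0*` pattern used above. It compares sizes only, so it will not detect corrupted content.

```shell
# check_merge PREFIX MERGED
# Succeeds if the combined byte size of all files matching PREFIX*
# equals the byte size of MERGED. (Hypothetical helper, not part of any tool.)
check_merge() {
    # With several files, the last line of `wc -c` is the total;
    # with a single file, it is that file's own size.
    parts_total=$(wc -c "$1"* | tail -n 1 | awk '{print $1}')
    merged_size=$(wc -c < "$2" | awk '{print $1}')
    [ "$parts_total" -eq "$merged_size" ]
}

# Example call, after running the cat command above:
# check_merge llama2-chat-70b-hf-0 llama2-chat-70b-hf.rkllm && echo "sizes match"
```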

But wait... will this run on my RK3588?

No. But I found it interesting to see what would happen if I converted it. Let's hope Microsoft never finds out that I was using their SSDs as swap, because they don't allow more than 32 GB of RAM on the student subscription :P

[Image: screenshot taken during the conversion]

And that is before it finished; it will probably reach around 600 GB of RAM + swap.

But hey! You can always try it yourself: get a 512 GB SSD (and use around 100-250 GB of it as swap) and an SBC with 32 GB of RAM, have some patience, and see if it loads. Good luck with that!
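
If you do try, it helps to know how much RAM + swap the board actually has before loading anything. The sketch below is an assumption-laden helper, not part of any RKLLM tooling: the function name `ram_plus_swap_gib` is made up, and it relies on the Linux `/proc/meminfo` format, where `MemTotal` and `SwapTotal` are reported in kB.

```shell
# ram_plus_swap_gib [MEMINFO_FILE]
# Prints total RAM + swap in GiB with one decimal place.
# Reads /proc/meminfo by default (Linux only); hypothetical helper name.
ram_plus_swap_gib() {
    # Sum the MemTotal and SwapTotal lines (values in kB), then convert to GiB.
    LC_ALL=C awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
                  END { printf "%.1f\n", kb / 1048576 }' "${1:-/proc/meminfo}"
}

# Example call on the device itself:
# ram_plus_swap_gib
```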

Main repo

See this for my full collection of converted LLMs for the RK3588's NPU:

https://huggingface.co/Pelochus/ezrkllm-collection

License

Same as the original LLM:

https://huggingface.co/meta-llama/Llama-2-70b-chat-hf/blob/main/LICENSE.txt
