Llama3-ChatQA-1.5-8B-256K
I tried to achieve a long-context RAG pipeline with this model, but I have very limited resources to test this workflow. Keep in mind that this is an experiment.
This model is an 'amalgamation' of winglian/llama-3-8b-256k-PoSE and nvidia/Llama3-ChatQA-1.5-8B.
Recipe
First I extracted the LoRA adapter from nvidia/Llama3-ChatQA-1.5-8B using mergekit. You can find the adapter here.
After the extraction I merged the adapter with the winglian/llama-3-8b-256k-PoSE model.
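The merge step above can be sketched with transformers and peft. This is a hedged sketch, not the exact commands used: the adapter path is a placeholder for wherever the mergekit-extracted adapter was saved, and loading an 8B model this way needs substantial RAM or a GPU.

```python
# Sketch of folding the extracted ChatQA LoRA adapter into the
# long-context PoSE base model. Assumes the adapter was already
# extracted with mergekit and saved locally (path is a placeholder).
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "winglian/llama-3-8b-256k-PoSE", torch_dtype="auto"
)
# Attach the extracted adapter to the long-context base...
model = PeftModel.from_pretrained(base, "path/to/extracted-adapter")
# ...then merge the adapter weights into the base weights and save.
merged = model.merge_and_unload()
merged.save_pretrained("Llama3-ChatQA-1.5-8B-256K")
```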
Prompt Format
Since the base model wasn't fine-tuned for any specific format, we can use ChatQA's chat format:
System: {System}
{Context}
User: {Question}
Assistant: {Response}
User: {Question}
Assistant:
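The template above can be assembled with a small helper. The function name and the plain newline separator are my assumptions based on the template as shown, not an official ChatQA utility:

```python
def build_chatqa_prompt(system, context, turns):
    """Build a prompt in the ChatQA chat format shown above.

    turns is a list of (role, text) pairs, e.g. ("User", "...") or
    ("Assistant", "..."). The prompt ends with a bare "Assistant:" so
    the model completes the next response.
    """
    lines = [f"System: {system}", context]
    for role, text in turns:
        lines.append(f"{role}: {text}")
    lines.append("Assistant:")
    return "\n".join(lines)


prompt = build_chatqa_prompt(
    "This is a chat between a user and an assistant.",
    "Retrieved context goes here.",
    [("User", "What does the context say?")],
)
```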
Big thanks to the Meta team, the NVIDIA team and, of course, Wing Lian.
Notes
This model has not been tested on any benchmarks due to compute limitations, and it has not been evaluated with Needle in a Haystack either. There is a real possibility that it performs worse than both of the original models.
Model tree for beratcmn/Llama3-ChatQA-1.5-8B-256K
Base model: meta-llama/Meta-Llama-3-8B