What is the chat TEMPLATE?

#1 · opened by yuiaa001

I used Ollama to run this model's GGUF, but the responses are similar to a base model's.
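For reference on the title question: the chat template for Meta-Llama-3-8B-Instruct ships with its tokenizer and uses the `<|start_header_id|>`/`<|eot_id|>` markers. A minimal sketch for inspecting it with transformers (assumes access to the gated meta-llama repository):

```python
from transformers import AutoTokenizer

# The Instruct tokenizer's config carries the Llama 3 chat template.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Hello!"}]

# Render the template as text rather than token ids to see the exact
# prompt format the model expects.
print(tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
))
```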

This model was fine-tuned with Unsloth using RoPE scaling on the Alpaca dataset. You can check out the fine-tuning script here: Unsloth GitHub. The goal of the fine-tuning was to extend the model's context length while preserving the properties of the base model, which is likely why you are getting base-model-like responses at context lengths of 8k or less. You should notice a difference at longer context lengths.
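For anyone wanting to reproduce this kind of setup, here is a minimal sketch of loading a model through Unsloth at an extended sequence length. The checkpoint name and the 16k figure are illustrative, not the exact training configuration; Unsloth rescales RoPE internally when max_seq_length exceeds the pretrained context:

```python
from unsloth import FastLanguageModel

# Illustrative settings: Unsloth applies RoPE scaling internally when
# max_seq_length exceeds the model's pretrained 8k context window.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-Instruct-bnb-4bit",  # assumed checkpoint
    max_seq_length=16384,  # extended context via RoPE scaling
    dtype=None,            # auto-detect
    load_in_4bit=True,
)
```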

If you want to use Llama 3 models at context lengths greater than the pretrained 8k, I recommend Ayush-1722/Meta-Llama-3-8B-Instruct-Summarize-v0.2-16K-LoRANET-Merged or Ayush-1722/Meta-Llama-3-8B-Instruct-Summarize-v0.2-24K-LoRANET-Merged. These models support context lengths of more than 80k while maintaining the properties of the base models.
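A minimal usage sketch with transformers, taking the 16K model above as an example (the generation settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ayush-1722/Meta-Llama-3-8B-Instruct-Summarize-v0.2-16K-LoRANET-Merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build the prompt through the model's own chat template.
messages = [{"role": "user", "content": "Summarize this document: ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```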

Ayush-1722 changed discussion status to closed
