Is this based on the "Update (5/3)" version?
Gradient uploaded a new version 18 hours ago claiming "Update (5/3): We further fine-tuned our model to strengthen its assistant-like chat ability as well. The NIAH result is updated." (also for their 1048k model btw)
Your quants were uploaded 5 hours ago, so maybe you used the latest source, but it's so close to their update that it could very well have been the previous version.
Yeah, this is why I dislike in-place model updates lmao. Yes, this is using the version from 18 hours ago.
You were fast then! Which is good, but also...
I kind of secretly hoped it wasn't the latest, so there was a chance the new version would be better in chat quality. This version makes formatting errors and is far less detailed/nuanced in its answers compared to the 8k base model. Then, as you already know, there's the odd `</s>` token.
I've made them aware of the `</s>` issue, but for the rest I'm not sure what would be needed:
https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k/discussions/20#66372ca74b43ab85e5ba5dbb
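For anyone who wants to verify this on their own copy, here's a minimal sketch (assuming you have the `transformers` library installed; the repo id is the one from the link above). Llama 3 doesn't use `</s>` at all, its end-of-turn token is `<|eot_id|>`, so if `</s>` encodes to a single added token that suggests something leaked in from a Llama 2 style tokenizer config:

```python
from transformers import AutoTokenizer

# Load the tokenizer straight from the HF repo under discussion
tok = AutoTokenizer.from_pretrained("gradientai/Llama-3-8B-Instruct-262k")

# Llama 3's EOS should be <|eot_id|> / <|end_of_text|>, never </s>
print("eos_token:", tok.eos_token)
print("extra special tokens:", tok.additional_special_tokens)

# If </s> encodes to a single id, it was registered as a special/added
# token; on a clean Llama 3 tokenizer it splits into several pieces.
ids = tok.encode("</s>", add_special_tokens=False)
print("</s> ->", ids, tok.convert_ids_to_tokens(ids))
```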
Ah well, at least we know it's not caused by the way you/llama.cpp do the quantisation.
It is a bit strange that issues like the non-capitalisation and the odd additional stop token were not caught by Gradient in their testing of the model.
I suppose most of their testing was automated and targeted at context retrieval rather than output quality.