Checkpoint "step115000-tokens482B" identical to main model?

#6
by amodaresi - opened

Hi,
The file hashes for checkpoint step115000-tokens482B and the main model match, which means the two are identical. (The same holds for the other shards.)
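In case anyone wants to reproduce the comparison locally, here is a minimal sketch. It assumes both revisions have already been downloaded into two separate directories and that the shards are `.safetensors` files; the directory names, the `compare_shards` helper, and the file pattern are my own placeholders, not anything from the repo.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_shards(dir_a, dir_b, pattern="*.safetensors"):
    """Map each shard name found in both directories to True if the hashes match.

    The default pattern is an assumption; adjust it to the actual shard format.
    """
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    results = {}
    for file_a in sorted(dir_a.glob(pattern)):
        file_b = dir_b / file_a.name
        if file_b.is_file():
            results[file_a.name] = sha256_of_file(file_a) == sha256_of_file(file_b)
    return results


if __name__ == "__main__":
    # Hypothetical local paths for the two downloaded revisions.
    for name, same in compare_shards("main", "step115000-tokens482B").items():
        print(name, "identical" if same else "different")
```

If every shard reports "identical", the two revisions contain byte-for-byte the same weights.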

Is it really an early stop or a misupload?

Also, I noticed that the nitro model is identical to the "step651581-tokens2731B" checkpoint.
What exactly is the nitro revision? Is it the model before it's further tuned with learning rate annealing, as described in the paper?
