Are there bias weights in Llama 3?

#202

by Iionbarista - opened Jul 11

Jul 11

and found that there are no designated weights for biases?

Does Llama have no biases or is it implicitly loaded from the weights?

Jul 11

Or are they replaced by the layer norm?

Jul 22

The Google PaLM paper mentions:

No biases were used in any of the dense kernels or layer norms. We found this to result in increased training stability for large models.
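One way to check this for any checkpoint is to list the parameter names and look for entries ending in "bias". The snippet below is only a minimal sketch: the helper find_bias_params is a hypothetical name, and sample_names is an illustrative hand-written list modeled on typical checkpoint naming, not the output of actually loading Llama 3.

```python
def find_bias_params(param_names):
    """Return the parameter names whose last dotted component is 'bias'."""
    return [name for name in param_names if name.split(".")[-1] == "bias"]


# Illustrative sample only (an assumption, not read from a real checkpoint).
sample_names = [
    "model.embed_tokens.weight",
    "model.layers.0.input_layernorm.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.down_proj.weight",
    "model.norm.weight",
    "lm_head.weight",
]

# An empty list means no separate bias tensors appear among the names.
print(find_bias_params(sample_names))
```

With a real model loaded in PyTorch, the same helper could be given `model.state_dict().keys()`. If the result is empty, the checkpoint stores no bias tensors at all, so nothing is being loaded implicitly from the weights.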
