fix-glu-mlp #17
opened by michael-guenther
The GluMLP does not work without flash attention because the tensors are passed in a different shape. This PR fixes the issue. I also verified that the embeddings computed with and without flash attention are identical.
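For context, here is a minimal sketch of the kind of shape-agnostic gated MLP this fix implies. The class and parameter names (`GluMLP`, `up_proj`, `down_proj`, SiLU gating) are assumptions for illustration, not the repository's actual code; the point is that operating on the last dimension keeps the module working for both the packed and padded tensor layouts.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GluMLP(nn.Module):
    """Hypothetical gated MLP sketch (names/dims are assumptions).

    Projects to 2x the intermediate size, splits into value and gate
    halves, applies the activation to the gate, then projects back.
    """

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.up_proj = nn.Linear(hidden_size, 2 * intermediate_size)
        self.down_proj = nn.Linear(intermediate_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Splitting along the *last* dimension keeps the module agnostic
        # to the input layout: packed (total_tokens, hidden) as used with
        # flash attention, or padded (batch, seq_len, hidden) without it.
        value, gate = self.up_proj(x).chunk(2, dim=-1)
        return self.down_proj(value * F.silu(gate))

# Both layouts produce outputs of the same shape as their inputs:
mlp = GluMLP(hidden_size=64, intermediate_size=128)
packed = mlp(torch.randn(10, 64))       # (total_tokens, hidden)
padded = mlp(torch.randn(2, 5, 64))     # (batch, seq_len, hidden)
```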
michael-guenther changed pull request status to open
LGTM!
michael-guenther changed pull request status to merged