The `self._attn` method of `QwenAttention` has an attention_mask bug
#10
opened by chencyudel
The `self._attn` method of `QwenAttention` needs a fix: the `attention_mask` argument is never added to the attention weights, so padded positions are not masked out before the softmax.
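For reference, a minimal sketch of what the fix looks like in a GPT-2-style `_attn`. This is not the actual Qwen code; the function is written standalone for illustration, and the causal-mask construction is an assumption following the common Hugging Face pattern:

```python
import math
import torch
import torch.nn.functional as F

def _attn(query, key, value, attention_mask=None):
    # query/key/value: (batch, num_heads, seq_len, head_dim)
    attn_weights = torch.matmul(query, key.transpose(-1, -2)) / math.sqrt(value.size(-1))

    # Causal mask: each query position may only attend to keys at or before it.
    q_len, k_len = query.size(-2), key.size(-2)
    causal_mask = torch.tril(
        torch.ones(q_len, k_len, dtype=torch.bool, device=query.device),
        diagonal=k_len - q_len,
    )
    attn_weights = attn_weights.masked_fill(~causal_mask, torch.finfo(attn_weights.dtype).min)

    # The missing step this issue points out: fold in the additive padding
    # attention_mask (0 at real tokens, a large negative value at padded
    # positions) before the softmax, so padded tokens get ~zero probability.
    if attention_mask is not None:
        attn_weights = attn_weights + attention_mask

    attn_weights = F.softmax(attn_weights, dim=-1)
    return torch.matmul(attn_weights, value), attn_weights
```

Without the `attention_mask` addition, left-padded sequences in a batch attend to their padding tokens, which is why the bug mainly surfaces in batch inference.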
Thanks for the feedback. We have updated the code (as part of the support for batch inference), which I think should fix this problem as well. Please pull the latest code and check whether it resolves the issue for you. Let me know if the problem persists.
jklj077 changed discussion status to closed