Pooling method: mean vs last?

#25
by alexzhou689 - opened

As the title says: which pooling method should I choose for inference and for training?

Alibaba-NLP org

We recommend the last-token pooling method. Please refer to the example code in the model introduction.
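For readers unsure what the two options do, here is a minimal sketch of the difference. It is not the model's own code (that is in the model introduction): the function names `mean_pool` and `last_token_pool`, the toy values, and the use of plain Python lists in place of tensors are all assumptions made for illustration.

```python
# Illustrative sketch only; names and toy data are hypothetical.
# `hidden` is one sequence of per-token vectors, `mask` is its attention mask
# (1 = real token, 0 = padding).

def mean_pool(hidden, mask):
    """Average the vectors of all non-padding tokens."""
    kept = [vec for vec, m in zip(hidden, mask) if m == 1]
    dim = len(hidden[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def last_token_pool(hidden, mask):
    """Return the vector of the last non-padding token.

    Taking the highest index with mask == 1 works for both
    left-padded and right-padded sequences.
    """
    last = max(i for i, m in enumerate(mask) if m == 1)
    return hidden[last]


# Toy example: 4 positions, 2 dimensions, right-padded with one pad token.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
mask = [1, 1, 1, 0]

mean_vec = mean_pool(hidden, mask)        # [3.0, 4.0]
last_vec = last_token_pool(hidden, mask)  # [5.0, 6.0]
```

The two methods generally give different vectors, so the pooling used at inference should match the one the model was trained with.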
