Add quantized ONNX weights
#3 opened by Xenova (HF staff)
No description provided.
Thanks for your interest. Do you have any specific preferences about the quantization? And could you tell us more about your use case?
Never mind, I misread your message.
It looks great. BTW, could you tell me more about the quantization methodology used in this PR? Are there any hacks we need to be aware of?
numb3r3 changed pull request status to merged
Hey! I applied the same quantization settings as all the other bert-based transformers.js models on the HF hub, like https://huggingface.co/Supabase/gte-small (see here for the full list). You can find a detailed list of settings applied here.
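For anyone following along who wants a feel for what 8-bit weight quantization does in general, here is a minimal sketch in plain Python. It is a generic illustration of symmetric per-tensor int8 quantization, not the settings used in this PR (those are in the list linked above). The function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Generic illustration only; this is not the configuration used in the PR.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to show the round trip.
weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)

# Rounding moves each value by at most half a quantization step.
max_error = max(abs(r - w) for r, w in zip(restored, weights))
print(quantized, scale, max_error)
```

The trade-off is the usual one: the weights take roughly a quarter of the float32 storage, at the cost of a small per-weight rounding error bounded by half the scale.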