8-bit quantization model #2
opened by mrm8488
As seen here: https://huggingface.co/spaces/bertin-project/bertin-gpt-j-6B/discussions/1#633aeb9acbdbadd99c070c74
With the new feature that automatically quantizes model weights to 8 bits, IMHO it no longer makes sense to maintain a separate, already-quantized model. What do you think, @versae?
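For reference, a minimal sketch of the automatic int8 loading path being discussed, assuming a recent `transformers` with `bitsandbytes` and `accelerate` installed:

```python
# Minimal sketch: load the regular checkpoint and let transformers
# quantize the weights to 8-bit on the fly at load time.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bertin-project/bertin-gpt-j-6B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread layers across available GPUs/CPU
    load_in_8bit=True,   # quantize weights to int8 while loading
)
```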
Yeah, it seems the LoRA work might not be maintained going forward, so using the int8 feature in transformers is probably the way to go. As I see it, we need some way to serialize the model in int8 so we can create a branch in the model repo that loads in int8 automatically.
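If such a branch existed, loading from it could look like the sketch below. The `8bit` branch name is an assumption, and this presumes the int8 weights on that branch deserialize correctly, which wasn't officially supported at the time:

```python
# Hypothetical sketch: load int8-serialized weights from a dedicated
# branch of the model repo. The "8bit" branch name is an assumption.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bertin-project/bertin-gpt-j-6B",
    revision="8bit",     # hypothetical branch holding the 8-bit weights
    device_map="auto",
)
```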
I have already done it with the latest checkpoint (https://huggingface.co/mrm8488/bertin-gpt-j-6B-ES-v1-8bit). Should I create a branch and push it there?
That'd be great!
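One way to create that branch and push the already-quantized checkpoint, sketched with the `huggingface_hub` client (the branch name and local folder path are assumptions, and this requires a recent `huggingface_hub`):

```python
# Sketch: create a branch on the Hub and upload the quantized
# checkpoint to it. Branch name and local folder are assumptions.
from huggingface_hub import create_branch, upload_folder

repo_id = "bertin-project/bertin-gpt-j-6B"

create_branch(repo_id, branch="8bit")         # make the new branch
upload_folder(
    repo_id=repo_id,
    folder_path="./bertin-gpt-j-6B-8bit",     # local quantized checkpoint
    revision="8bit",                          # push to the new branch
)
```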