falcon-40b-instruct quantized with GPTQ using the script from https://github.com/huggingface/text-generation-inference/pull/438, with the following settings (a rough reproduction sketch follows the list):
- group size: 128
- act order: true
- nsamples: 128
- dataset: wikitext2
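
If you want to reproduce a similar quantization without the TGI script, the sketch below uses the `GPTQConfig` path in 🤗 Transformers / Optimum with the settings listed above. The 4-bit width, the base model id `tiiuae/falcon-40b-instruct`, and `trust_remote_code=True` are assumptions not stated in this card; the `nsamples` value is handled internally by the quantizer rather than set explicitly here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

base_model = "tiiuae/falcon-40b-instruct"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

# Mirrors the list above: group size 128, act order (desc_act) true,
# calibration on wikitext2. The bit width is an assumption (4-bit is typical for GPTQ).
gptq_config = GPTQConfig(
    bits=4,              # assumed; not stated in the card
    group_size=128,
    desc_act=True,       # "act order: true"
    dataset="wikitext2",
    tokenizer=tokenizer,
)

# Quantizes while loading; needs enough GPU memory for the 40B weights.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=gptq_config,
    device_map="auto",
    trust_remote_code=True,  # Falcon shipped custom modeling code at the time
)

model.save_pretrained("falcon-40b-instruct-gptq")
tokenizer.save_pretrained("falcon-40b-instruct-gptq")
```

For serving, text-generation-inference can load GPTQ weights with something like `text-generation-launcher --model-id <this repo> --quantize gptq`, the path added by the PR linked above.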