cognitivecomputations/dolphin-2.9-llama3-8b-256k

Was trying to quantize to 8 bits to reduce VRAM footprint. Got the stuff below.

#3 opened 7 months ago by

pls help

#2 opened 7 months ago by