6.0 bit exl2 quant (8 bit head) of the Fireworks Hermes 2.5 fine-tune of Mixtral-8x22b
Use the Vicuna prompt template
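For reference, a minimal sketch of building a single-turn prompt in the commonly used Vicuna v1.1 layout. The system sentence and the `USER:` / `ASSISTANT:` markers below are an assumption based on that common layout, not something this card specifies, and `build_vicuna_prompt` is a hypothetical helper name.

```python
# Assumed Vicuna v1.1 style system preamble (not confirmed by this model card).
DEFAULT_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(user_message: str, system: str = DEFAULT_SYSTEM) -> str:
    """Return a single-turn prompt string ending where the model should start generating."""
    return f"{system} USER: {user_message.strip()} ASSISTANT:"


if __name__ == "__main__":
    print(build_vicuna_prompt("Summarize what a mixture-of-experts model is."))
```

If the outputs look off, check the tokenizer or chat template shipped with the model files, since fine-tunes sometimes vary the spacing or the system line.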
Needs ~120 GB VRAM (2x A100 or 3x RTX 6000)
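As a rough sanity check on the VRAM figure, the weight footprint can be estimated from the parameter count and the bits per weight. This sketch assumes roughly 141 billion total parameters for Mixtral-8x22b (the publicly reported figure, not stated in this card) and ignores the 8 bit head, the KV cache, and runtime overhead, which account for the remainder up to ~120 GB. `weight_gb` is a hypothetical helper name.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    # params (in billions) * bits per weight / 8 bits per byte = billions of bytes
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Assumption: ~141B total parameters for Mixtral-8x22b.
    print(f"weights only: {weight_gb(141, 6.0):.2f} GB")
```

The gap between the weights-only estimate and the stated requirement is the budget left for context cache and activations, so longer contexts need more headroom.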