Default to eager attention
pinned
2
#1 opened 4 months ago
by
lysandre
Can anyone provide a GPTQ 4-bit or AWQ version of this model?
#4 opened 2 months ago
by
esoterikx
vLLM model load error
1
#3 opened 4 months ago
by
Dharma0818
How much GPU memory is required to run this model?
#2 opened 4 months ago
by
Dharma0818