Tried to run on a dedicated endpoint with 4x A100 (320GB) but still get "not enough hardware capacity"
#11
by
trungnx26
- opened
trungnx26 changed discussion status to closed
I'm having the same issue. Were you able to fix it?
trungnx26 changed discussion status to open
I believe this is a Hugging Face issue with us-east-1. After many tries, it works well with just an A10.
And I'm trying with my 32GB of RAM and a beloved 1060 6GB.
Hi @trungnx26, may I ask which container type you used? Default or Text Generation Inference?
Also, can you tell us your specific endpoint settings? (AWS or GCP? Which Region?)
I have tried deploying in many regions, but it did not work. Thanks!
Just Default for Text Generation Inference.