Deployment to SageMaker - instance type?
#46, opened by MavWolverine
The "Deploy to SageMaker" instructions have:
    predictor = huggingface_model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.2xlarge",
        container_startup_health_check_timeout=300,
    )
But it seems to run out of memory on "ml.g5.2xlarge".
What is the minimum instance type needed to get the model up and running in SageMaker?
Almost 3 months late here, but the Mistral docs suggest a minimum of 60GB of CPU RAM, which doesn’t seem possible without a Multi-GPU VM, so I’m guessing the ml.g5.2xlarge is a typo and should actually be ml.g5.12xlarge.
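To make the sizing argument concrete, here is a rough sketch that picks the smallest ml.g5 instance whose total GPU memory covers a given requirement. It assumes the 60 GB figure is read as accelerator memory, which is how the reply above treats it. The spec table is hard-coded from my understanding of the G5 family (one 24 GB A10G per GPU) and should be checked against current AWS documentation. `G5_INSTANCES`, `smallest_fitting_instance`, and `deploy_kwargs` are hypothetical names for illustration, not part of the SageMaker SDK.

```python
# Assumed specs: instance type -> (GPU count, total GPU memory in GB).
# Listed from smallest to largest; verify against AWS documentation.
G5_INSTANCES = {
    "ml.g5.2xlarge": (1, 24),
    "ml.g5.4xlarge": (1, 24),
    "ml.g5.8xlarge": (1, 24),
    "ml.g5.12xlarge": (4, 96),
    "ml.g5.24xlarge": (4, 96),
    "ml.g5.48xlarge": (8, 192),
}


def smallest_fitting_instance(required_gb):
    """Return the first (smallest) listed instance with enough total GPU memory."""
    for name, (_gpus, total_gb) in G5_INSTANCES.items():
        if total_gb >= required_gb:
            return name
    raise ValueError(f"no listed ml.g5 instance offers {required_gb} GB of GPU memory")


def deploy_kwargs(required_gb):
    """Build keyword arguments mirroring the deploy() call quoted in the question."""
    return {
        "initial_instance_count": 1,
        "instance_type": smallest_fitting_instance(required_gb),
        "container_startup_health_check_timeout": 300,
    }


if __name__ == "__main__":
    # 60 GB is the figure quoted in the reply above.
    print(deploy_kwargs(60)["instance_type"])
```

Under these assumptions the result would be passed on as `huggingface_model.deploy(**deploy_kwargs(60))`. A single 24 GB GPU falls well short of 60 GB, which is consistent with the out-of-memory failure on ml.g5.2xlarge.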