Deployment to SageMaker - instance type?

#46
by MavWolverine - opened

The Deploy to SageMaker instructions have:

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

But it seems to run out of memory on "ml.g5.2xlarge".

What is the correct minimum instance type to get the model up and running in SageMaker?

Almost 3 months late here, but the Mistral docs suggest a minimum of 60 GB of CPU RAM, which doesn’t seem possible without a multi-GPU instance, so I’m guessing ml.g5.2xlarge is a typo and should actually be ml.g5.12xlarge.
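To make the sizing guess above easier to check, here is a rough sketch that compares the 60 GB figure against the ml.g5 family. The spec table is copied from AWS's published g5 numbers as I remember them, so verify it before relying on it. `G5_SPECS` and `smallest_fit` are names made up for this example, not part of any SDK. The thread doesn't settle whether the 60 GB refers to host RAM or GPU memory, so the sketch checks both readings and treats GB and GiB as interchangeable.

```python
# Assumed specs for the ml.g5 family, listed from smallest to largest size name.
# Values are (GPU count, total GPU memory in GiB, host RAM in GiB).
# Double-check these against the AWS instance type tables.
G5_SPECS = {
    "ml.g5.xlarge": (1, 24, 16),
    "ml.g5.2xlarge": (1, 24, 32),
    "ml.g5.4xlarge": (1, 24, 64),
    "ml.g5.8xlarge": (1, 24, 128),
    "ml.g5.12xlarge": (4, 96, 192),
    "ml.g5.16xlarge": (1, 24, 256),
    "ml.g5.24xlarge": (4, 96, 384),
    "ml.g5.48xlarge": (8, 192, 768),
}


def smallest_fit(required_gib, kind):
    """Return the first listed instance whose memory of the given kind
    ("gpu" or "host") is at least required_gib, or None if none fits."""
    column = {"gpu": 1, "host": 2}[kind]
    # Dicts keep insertion order, so this walks the table top to bottom.
    for name, spec in G5_SPECS.items():
        if spec[column] >= required_gib:
            return name
    return None


if __name__ == "__main__":
    print("60 GB read as GPU memory:", smallest_fit(60, "gpu"))
    print("60 GB read as host RAM:  ", smallest_fit(60, "host"))
```

If the 60 GB is GPU memory, every single-GPU g5 size tops out at 24 GiB, which matches the out-of-memory report on ml.g5.2xlarge and points at ml.g5.12xlarge. If it is host RAM, ml.g5.4xlarge would meet the number on paper, although it still has only one 24 GiB GPU.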
