Deployment to SageMaker - instance type?

#46

by MavWolverine - opened Jul 5

Discussion

MavWolverine

Jul 5

edited Jul 5

The Deploy to SageMaker instructions have:

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

But it seems to run out of memory on "ml.g5.2xlarge".

What is the minimum instance type needed to get the model up and running on SageMaker?

snackary

Sep 26

Almost 3 months late here, but the Mistral docs suggest a minimum of 60GB of CPU RAM, which doesn’t seem possible without a multi-GPU VM, so I’m guessing that ml.g5.2xlarge is a typo and should actually be ml.g5.12xlarge.
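
For anyone who wants to sanity-check that guess, here is a rough sketch that compares total GPU memory across a few G5 instance types. It assumes the 60GB figure is meant as total GPU memory (the multi-GPU conclusion only follows under that reading) and that each G5 GPU is an NVIDIA A10G with 24 GB. The table and the helper names `total_gpu_memory_gb` and `smallest_instance_for` are made up for illustration and are not part of the SageMaker SDK. It also ignores runtime overhead such as the KV cache, so treat the result as a lower bound.

```python
from typing import Optional

# Number of GPUs per G5 instance type, listed from smallest to largest.
# Assumption: every G5 GPU is an NVIDIA A10G with 24 GB of memory.
G5_GPU_COUNT = {
    "ml.g5.2xlarge": 1,
    "ml.g5.4xlarge": 1,
    "ml.g5.8xlarge": 1,
    "ml.g5.12xlarge": 4,
    "ml.g5.24xlarge": 4,
    "ml.g5.48xlarge": 8,
}
A10G_MEMORY_GB = 24


def total_gpu_memory_gb(instance_type: str) -> int:
    """Total GPU memory across all GPUs on the instance, in GB."""
    return G5_GPU_COUNT[instance_type] * A10G_MEMORY_GB


def smallest_instance_for(required_gb: float) -> Optional[str]:
    """Return the first listed instance with enough total GPU memory,
    or None if no instance in the table is large enough."""
    for instance_type in G5_GPU_COUNT:
        if total_gpu_memory_gb(instance_type) >= required_gb:
            return instance_type
    return None


if __name__ == "__main__":
    # 60 GB is the figure quoted above, read here as GPU memory.
    print(total_gpu_memory_gb("ml.g5.2xlarge"))
    print(smallest_instance_for(60))
```

Under those assumptions, ml.g5.2xlarge has a single 24 GB GPU, and the first entry in the table that clears 60 GB is ml.g5.12xlarge, which matches the guess above.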
