
How to run inference on a 40 GB A100 with 80 GB of RAM on Colab Pro?

#17
by SadeghPouriyan - opened

I want to use this model on Colab Pro. My runtime has a 40 GB A100 and 80 GB of RAM. What is the best practice for running it on this system?
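One common approach for this hardware budget, sketched below under assumptions: a 70B-parameter model needs roughly 130 GiB of weight memory in fp16, which does not fit in 40 GB of VRAM, but 4-bit quantization brings the weights down to roughly 33 GiB, which does. The snippet uses the Transformers `BitsAndBytesConfig` 4-bit path with `device_map="auto"` so any overflow layers spill to the 80 GB of CPU RAM. The checkpoint id `MODEL_ID` is hypothetical; the thread does not name the exact model, so substitute the one you are using.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical checkpoint id -- the thread does not name the exact model.
MODEL_ID = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"

def weight_gib(n_params: float, bits_per_param: int) -> float:
    """Rough weight-memory footprint in GiB (ignores activations and KV cache)."""
    return n_params * bits_per_param / 8 / 2**30

# 70e9 params: ~130 GiB at fp16 (too big for 40 GB VRAM), ~33 GiB at 4-bit (fits).
assert weight_gib(70e9, 16) > 40
assert weight_gib(70e9, 4) < 40

# 4-bit NF4 quantization with bf16 compute is the usual single-GPU setting.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

def load_model():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        quantization_config=bnb_config,
        device_map="auto",  # offload overflow layers to the 80 GB of CPU RAM
    )
    return tokenizer, model
```

Note that CPU-offloaded layers make generation much slower, so keeping the whole quantized model on the GPU (which 4-bit allows for a 70B model on a 40 GB card, with modest context lengths) is preferable when it fits.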
