@automatedstockminingorg on Hugging Face: "hi everyone, i have trained a Qwen 14b model on a smaller dataset, but its now…"

Hugging Face

Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Back to feed

automatedstockminingorg

posted an update 28 days ago

Post

2355

hi everyone,
i have trained a Qwen 14b model on a smaller dataset, but its now very tricky because i have got nowhere to use it via inference (the paid for inference on hf costs quite a lot), does anyone know of anywhere where i can deploy my model and use it via api for a reasonable cost, or ideally none. thanks

John6666

28 days ago

The 14B model might just barely work with 16GB of VRAM in the free version of Google Colab, assuming 4-bit quantization at runtime.
I'm not familiar with Colab itself, so ask someone else how to use it.

skerit

27 days ago

You might want to give Predibase a try.

LeroyDyer

27 days ago

Download it and run it with lm studio the. Use the open ai to access it

hakutaku

27 days ago

glhf.chat has an API for any LLMs on huggingface for free, although it has a really low rate limit of 480 requests/8 hours (anyway it's free).

foscraft

27 days ago

Try google colab.
You can run it on the free tier.

joaomsimoes

26 days ago

Rent a VM in Runpod. I would recommend a 24gb VRAM and quantize the model to 8bit. You can use TGI and stop the VM when not in use. Or you can use serverless inference in runpod, it is also a great option for a small quantity of requests.

In this post