Spaces: Qwen/QwQ-32B-preview


Hardware Requirements

by Lightchain - opened 4 days ago

Discussion

Lightchain

4 days ago

Very interesting model. Does anyone have info on what hardware is required to run it?

alpindale

Qwen org 4 days ago

You will need ~80GB of memory for inference at 16-bit precision, roughly half that at 8-bit, and roughly a quarter at 4-bit.
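For anyone who wants to sanity-check those numbers, here is a rough back-of-the-envelope sketch. It assumes about 32 billion parameters (inferred from the "32B" in the model name, not an official count) and counts weights only. The helper name `weight_memory_gb` is made up for illustration. The gap between the weights-only figure and ~80GB is presumably headroom for activations and the KV cache.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: ~32 billion parameters, inferred from "32B" in the model name.
N_PARAMS = 32e9

for bits in (16, 8, 4):
    gb = weight_memory_gb(N_PARAMS, bits)
    print(f"{bits}-bit: ~{gb:.0f} GB for weights alone")
```

Real usage will be higher than these weights-only figures, and it grows with context length and batch size.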

richwardle

4 days ago

I just ran it at 16-bit on an A100 SXM with 80GB of VRAM.

sszymczyk

4 days ago

•

edited 4 days ago

With llama.cpp, this model at Q4_K_M quantization with a 15000-token context fits on a single RTX 3090 or 4090 (24GB VRAM). Its performance doesn't seem to be affected much, at least based on my limited testing on a set of 50 reasoning puzzles.
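To get a feel for how much of that 24GB the context itself takes, here is a minimal sketch of the usual KV-cache size formula. The architecture values below (64 layers, 8 KV heads, head dimension 128) are assumptions that do not come from this thread, so check them against the model's `config.json`. The 2 bytes per element assumes an unquantized fp16 cache. The function name `kv_cache_bytes` is hypothetical.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_ctx: int, bytes_per_elem: int = 2) -> int:
    """Size of the key/value cache for one sequence of n_ctx tokens."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# Assumed architecture values (not from this thread); verify in config.json.
size = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128, n_ctx=15000)
print(f"KV cache at 15000 tokens: {size / 2**30:.2f} GiB")
```

Under those assumptions the cache is a few GiB on top of the quantized weights, which would explain why the context has to be capped to fit on a 24GB card.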

Ainonake

4 days ago

You can also run it on CPU if you have 32GB of RAM.

gptahmed1

1 day ago

•

edited 1 day ago
