appvoid
posted an update Sep 28
700M parameters is the sweet spot for CPU usage. Please, let's make more of those!

How long did it take to reply and what are your context window limits? Model type?


It takes 3-5 seconds on average to reply when the prompt is longer than 30-50 words, and the time increases linearly with the number of tokens in the prompt. The one in the picture is Llama 3 1B, but the one I'm using right now is arco 2, which is a Llama model; it cannot keep any kind of general knowledge. I noticed with Qwen 2 (and later confirmed with Meta's model) that you don't need a lot of parameters to get general knowledge, you just need tons of data.
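
If you want to check the "increases linearly" part on your own machine, here is a rough sketch: time a few prompts of different lengths, then fit a straight line through them. The function names and the sample numbers are made up for illustration (they are not measurements from this thread), and it assumes reply time really is close to linear in prompt tokens.

```python
def fit_linear_latency(samples):
    """Least-squares fit of seconds = base + per_token * tokens.

    samples: list of (prompt_tokens, seconds) pairs you measured yourself.
    Returns (base, per_token).
    """
    n = len(samples)
    mean_x = sum(t for t, _ in samples) / n
    mean_y = sum(s for _, s in samples) / n
    var_x = sum((t - mean_x) ** 2 for t, _ in samples)
    if var_x == 0:
        raise ValueError("need at least two different prompt lengths")
    cov_xy = sum((t - mean_x) * (s - mean_y) for t, s in samples)
    per_token = cov_xy / var_x
    base = mean_y - per_token * mean_x
    return base, per_token


def estimate_latency(prompt_tokens, base, per_token):
    """Predicted reply time in seconds for a prompt of the given length."""
    return base + per_token * prompt_tokens


# Hypothetical measurements (prompt tokens, seconds), NOT from the post.
samples = [(50, 3.0), (100, 5.0), (200, 9.0)]
base, per_token = fit_linear_latency(samples)
estimate_400 = estimate_latency(400, base, per_token)
```

If the fitted line misses your real timings by a lot at longer prompts, the linear assumption probably doesn't hold for your setup.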
