this post was submitted on 04 Dec 2023
LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


Currently running these models on CPU:

  • Ryzen 9 3950X

  • 64 GB DDR4-3200

  • RX 6700 XT 12 GB (doesn't fit much more than 13B models, so not relevant here)

Running on CPU with GPT4All, I'm getting 1.5-2 tokens/sec. It finishes, but man is there a lot of waiting.
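For context, that speed is about what the memory bandwidth predicts: token generation on CPU is roughly bandwidth-bound, since every token requires reading all the model weights from RAM. A quick Python sketch (the bandwidth and model-size figures are ballpark assumptions, not measurements):

```python
# Back-of-the-envelope: tokens/sec ceiling for a memory-bandwidth-bound
# workload is (RAM bandwidth) / (bytes of weights read per token).

def est_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on generation speed."""
    return bandwidth_gb_s / model_size_gb

# Dual-channel DDR4-3200: 2 channels * 8 bytes * 3200 MT/s = 51.2 GB/s peak.
ddr4_3200_dual = 2 * 8 * 3200 / 1000  # GB/s

# A 30B model at ~4-bit quantization is roughly 19 GB of weights (assumed).
q4_30b_gb = 19.0

print(f"~{est_tokens_per_sec(ddr4_3200_dual, q4_30b_gb):.1f} tokens/sec ceiling")
# → ~2.7 tokens/sec ceiling
```

So 1.5-2 tokens/sec is close to the theoretical ceiling for this RAM; a faster CPU alone won't help much, which is why the question below is about the cheapest way to get more effective bandwidth (more channels, or VRAM).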

What's the most affordable way to get a faster experience? The two models I play with the most are Wizard-Vicuna 30B, and WizardCoder/CodeLlama 34B.

no comments (yet)