Technology

998 readers

110 users here now

A tech news sub for communists

founded 2 years ago

MODERATORS

muad_dibber@lemmygrad.ml

LocalAI is the free, open source locally run drop-in replacement REST API for OpenAI (github.com)

submitted 1 month ago by yogthos@lemmygrad.ml to c/technology@lemmygrad.ml

15 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[–] JucheStalin@lemmygrad.ml 2 points 1 month ago (10 children)

So it's a fancy proxy to existing AI offerings?

[–] yogthos@lemmygrad.ml 12 points 1 month ago (9 children)

It's a way to run models on your local machine and provide an API that's compatible with OpenAI that can be used by apps that normally rely on that.

[–] JucheStalin@lemmygrad.ml 2 points 1 month ago (3 children)

Hm so it downloads fixed models and works without an internet connection? Interesting.

[–] yogthos@lemmygrad.ml 2 points 1 month ago (1 children)

Right, you can download any publicly available model and run it without using the internet. Caveat is that you do need a relatively fast machine to make it performant.

[–] FuckBigTech347@lemmygrad.ml 3 points 1 month ago (1 children)

For reference the oldest card I have that Vulkan supports is an RX 560 that I bought in 2017 (I'm on GNU/Linux w/ amdgpu and the RADV mesa driver aka. "The Default"). Most medium models on it run at around 6 - 10 Tokens/s. Some crawl to below 6 Tokens/s though and become slower the longer the answer they output is, probably because parts of the model is in RAM since that card has "only" 4GB of VRAM. Models that fully fit in VRAM are a lot faster.

[–] KrasnaiaZvezda@lemmygrad.ml 1 points 1 month ago

I can run Qwen 2.5 Coder 14B Q4_k_m on CPU at only a little above 1 t/s but it's worth it when I just want to have it look at whatever code I have without disclosing it with corporations that don't have my best interests in mind.

load more comments (1 replies)

load more comments (6 replies)