this post was submitted on 16 Sep 2024
6 points (87.5% liked)

LocalLLaMA

2249 readers
1 users here now

Community to discuss about LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 1 year ago
MODERATORS
 

I just found https://www.arliai.com/ who offer LLM inference for quite cheap. Without rate-limits and unlimited token generation. No-logging policy and they have an OpenAI compatible API.

I've been using runpod.io previously but that's a whole different service as they sell compute and the customers have to build their own Docker images and run them in their cloud, by the hour/second.

Should I switch to ArliAI? Does anyone have some experience with them? Or can recommend another nice inference service? I still refuse to pay $1.000 for a GPU and then also pay for electricity when I can use some $5/month cloud service and it'd last me 16 years before I reach the price of buying a decent GPU...

Edit: Saw their $5 tier only includes models up to 12B parameters, so I'm not sure anymore. For larger models I'd need to pay close to what other inference services cost.

Edit2: I discarded the idea. 7B parameter models and one 12B one is a bit small to pay for. I can do that at home thanks to llama.cpp

you are viewing a single comment's thread
view the rest of the comments
[–] hendrik@palaver.p3x.de 1 points 1 month ago

Thanks. I'll try lmsys, but ultimately I do mind privacy. But I also fool around.

Yeah, I know about AMD GPUs. Nvidia has quite a monopoly on AI and as everyone uses their hardware and software frameworks, that's what's supported best. At least currently. My predicion is: that's about to change. But their competitors didn't do a great job. But I've been annoyed with Nvidia's stupid Linux drivers for so long, (I mean that also changed,) but I'd like to give my money to someone else, and swallow that pill. If I decide to do it anyways.

Thanks for the info. I think I can do something with that. Mistral-Nemo is pretty awesome for its size. Intelligent, can write prose, dialogue or answer questions, it's completely uncensored out of the box...