How to self-host an LLM? (programming.dev)

I'm trying to figure out how to host one myself. I've been trying to use Bavarder and LocalAI, but I'm failing due to a lack of knowledge and missing instructions. Any advice? Has anyone succeeded with anything? I'd be happy to start with smaller steps as well, as long as I get somewhere.

top 8 comments
[-] c10l@lemmy.world 16 points 6 months ago

It’s pretty easy with Ollama. Install it, then ollama run mistral (or another model; there are a few available ootb). https://ollama.ai/

Another option is Llamafile. https://github.com/Mozilla-Ocho/llamafile
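For scripting, note that a running Ollama instance also exposes a local HTTP API (port 11434 by default). A minimal stdlib-only sketch of building a request against its /api/generate endpoint — the model name and prompt here are just placeholders:

```python
import json
import urllib.request

def build_generate_request(model: str, prompt: str,
                           host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("mistral", "Why self-host an LLM?")
# With the Ollama server running, urllib.request.urlopen(req) returns a JSON
# body whose "response" field holds the generated text.
```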

[-] hazeebabee@slrpnk.net 2 points 6 months ago

Sounds like a really cool project; sadly I don't have much knowledge to contribute. Still, what kind of issues have you run into? Any specific errors or problems?

[-] jvrava9@lemmy.dbzer0.com 2 points 6 months ago

Maybe Serge would fit your use case.

[-] das@lemellem.dasonic.xyz 1 points 6 months ago

Serge is probably the easiest way to get a basic setup. If you just want to download a model and chat, I recommend it.

[-] das@lemellem.dasonic.xyz 2 points 6 months ago

If you want to be able to get into the nitty gritty or play with options besides just a chat, I recommend Text Generation WebUI.

Installing is pretty easy, then you just download your desired model from Hugging Face.

Or if you want to use it for roleplay or adventure style games, KoboldCPP is easy to set up.

[-] DontNoodles@discuss.tchncs.de 1 points 6 months ago

I've heard good things about H2O AI if you want to self host and tweak the model by uploading documents of your own (so that you get answers based on your dataset). I'm not sure how difficult it is. Maybe someone more knowledgeable will chime in.

[-] Sims@lemmy.ml 1 points 6 months ago

If you're low on hardware, look into Petals or the Kobold AI Horde frameworks. Both share models in a P2P fashion, AFAIK.

Petals, at least, lets you create private networks, so you could host part of a model on your 24/7 server, some on your laptop CPU, and the rest on your laptop GPU, for example.

Haven't tried it though, so good luck ;)

[-] Aties@lemmy.world 1 points 6 months ago

I haven't looked into specific apps, but I've been wanting to try various trained models. I figured just self-hosting JupyterHub and pulling models from Hugging Face would be a quick and flexible way to do it.
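That approach can be as simple as a few lines in a notebook cell. A sketch using the transformers library's pipeline API — it assumes transformers and torch are installed, and gpt2 is just a small placeholder model that runs on CPU; swap in whatever model you pull from Hugging Face:

```python
# Sketch: trying out a Hugging Face model in a Jupyter cell.
# Assumes `pip install transformers torch`; gpt2 is a small stand-in model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Self-hosting an LLM means", max_new_tokens=20)
print(result[0]["generated_text"])
```

The first run downloads the model weights into the local Hugging Face cache, so subsequent runs work offline.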

this post was submitted on 26 Dec 2023
35 points (97.3% liked)

Selfhosted

37770 readers
313 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

  1. Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.

  2. No spam posting.

  3. Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.

  4. Don't duplicate the full text of your blog or github here. Just post the link for folks to click.

  5. Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).

  6. No trolling.

Resources:

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

founded 1 year ago