16 points (90.0% liked), submitted 24 Aug 2023 by cll7793@lemmy.world to c/fosai@lemmy.world

I know it is possible to split layers between Nvidia GPUs with cuBLAS. But with AMD and ROCm it seems far more difficult, and maybe not implemented yet in any project?

top 2 comments
j4k3@lemmy.world 1 point 10 months ago

I haven't, but when I was researching hardware a month ago I came across a telemetry source from a Stable Diffusion add-on. That data showed clear evidence of cloud instances running banks of 7900 XTX cards. The 7000 series are supposedly the only Radeon cards supported by HIP, and I didn't see any other Radeon cards with the same signs of data-center use. Even then, each card could be in its own separate container (or whatever the correct term is for a cloud instance); I'd expect someone doing model training to connect them all. Not really helpful, I know.

Honestly, the code base is not that hard to parse, especially if you have a capable LLM running and let it explain snippets and small functions.

AMD GPU support appears to be included in GGML. I don't see any reason why you wouldn't be able to split a model between multiple GPUs, since the splitting is handled within GGML itself and isn't tied to any particular library/driver/backend.
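For what it's worth, llama.cpp (built on GGML) exposes the same multi-GPU flags whichever backend it was compiled against, and around this time its build system gained a LLAMA_HIPBLAS option for ROCm. Here's a minimal sketch using the llama-cpp-python bindings; the build flag, model filename, and split ratios are assumptions for illustration, not something I've tested on AMD hardware:

```python
# Minimal sketch: splitting model layers across two GPUs with llama-cpp-python.
# Assumes the package was built against ROCm/hipBLAS, e.g.:
#   CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-13b.q4_0.gguf",  # placeholder model file
    n_gpu_layers=-1,          # offload all layers to GPU
    tensor_split=[0.5, 0.5],  # proportion of the model placed on each of two cards
)

out = llm("Q: Can llama.cpp split a model across AMD GPUs? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

The equivalent knobs on the llama.cpp CLI are --n-gpu-layers and --tensor-split, and as far as I can tell they're handled the same way regardless of whether the backend is cuBLAS or hipBLAS.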
