Throwaway4669332255

joined 2 years ago
[–] Throwaway4669332255@lemmy.world 7 points 1 month ago (2 children)

It literally costs $3000.

That's almost 4 times the cost of my 3090.

[–] Throwaway4669332255@lemmy.world -2 points 2 months ago

Where would you even move to?

[–] Throwaway4669332255@lemmy.world 23 points 2 months ago (1 children)

Cold rooms have improved my sleep and made my life better.

I've wasted SO MANY hours arguing with stupid people on reddit....

[–] Throwaway4669332255@lemmy.world 9 points 4 months ago (4 children)

But it'll trickle down? Right? RIGHT??

[–] Throwaway4669332255@lemmy.world 5 points 5 months ago (3 children)

Anything but Google.

[–] Throwaway4669332255@lemmy.world 16 points 8 months ago (1 children)

Idk man I've yet to know anyone who died from drinking magma.

[–] Throwaway4669332255@lemmy.world 6 points 11 months ago (14 children)

How does the Nemo 12B compare to the Llama 3.1 8B?

Apparently I am an idiot and read the wrong paper. The previous paper mentioned that it is "comparable with the 8-bit models".

https://huggingface.co/papers/2310.11453
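
For intuition on why a 1-bit model would even be compared to 8-bit ones, here's my own back-of-the-envelope sketch (not from the paper) of weight memory at different precisions:

```python
# My own back-of-the-envelope sketch, not from the paper:
# weight memory in bytes ~= parameter_count * bits_per_weight / 8

def weight_gib(params: float, bits: float) -> float:
    """Approximate weight storage only; ignores activations, KV cache, overhead."""
    return params * bits / 8 / 1024**3

for bits in (16, 8, 4, 1.58, 1):
    print(f"7B @ {bits:>5}-bit ~ {weight_gib(7e9, bits):5.2f} GiB")
```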

[–] Throwaway4669332255@lemmy.world 1 points 1 year ago (2 children)

They said theirs is "comparable with the 8-bit models". It's all tradeoffs. It isn't clear to me where you should allocate your compute/memory budget. I've noticed that full 7B 16-bit models often produce better results for me than some much larger quantized models. It will be interesting to find the sweet spot.
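
As a concrete illustration of that budget question (my own numbers, counting weights only), here is how a few configurations land on similar footprints:

```python
# My own illustration of the compute/memory tradeoff; weights only,
# ignoring KV cache, activations, and quantization overhead.

def weights_gib(params: float, bits: float) -> float:
    return params * bits / 8 / 1024**3

configs = [
    ("7B  @ 16-bit (full precision)", 7e9, 16),
    ("13B @  8-bit", 13e9, 8),
    ("34B @  4-bit", 34e9, 4),
]
for name, params, bits in configs:
    print(f"{name}: ~{weights_gib(params, bits):.1f} GiB")
# ~13.0, ~12.1, and ~15.8 GiB: similar budgets, very different
# parameter-count/precision splits -- hence the "sweet spot" question.
```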

[–] Throwaway4669332255@lemmy.world 2 points 1 year ago (4 children)

So are more bits less important than more parameters? Would a higher parameter count or a higher bit count matter more if the models ended up the same size?
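
To make "same size" concrete, here's my own arithmetic (size ≈ parameters × bits / 8), inverting the question: how many parameters fit in a fixed budget at each precision?

```python
# My own arithmetic, not from any benchmark:
# params ~= budget_bytes * 8 / bits_per_weight

BUDGET_GB = 8  # hypothetical budget, e.g. usable VRAM on a consumer card

for bits in (16, 8, 4):
    params_b = BUDGET_GB * 1e9 * 8 / bits / 1e9
    print(f"{bits:>2}-bit: ~{params_b:.0f}B parameters in {BUDGET_GB} GB")
# 16-bit: ~4B, 8-bit: ~8B, 4-bit: ~16B -- identical file size,
# very different model capacities.
```

The rule of thumb I've seen quoted is that at equal size the larger, lower-bit model usually wins down to around 4-bit, but that's exactly the sweet-spot question.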

I'm so glad I work for a medium-small company. We moved to a smaller office and are only required to go in twice a month.
