yeah, but not a 4:1 ratio. overhead is usually a small bit on top. a 1000 GiB SSD may actually be 1100 GiB internally, but the controller would not show that.
In my case it’s an Epyc 7642 with 8x64GB DDR4 2666, so that may be why my generation is significantly slower.
I find anything below 5 tokens per second not really usable, so that’s why I stick with my M1 Ultra. It has plenty of really fast RAM, which most likely explains why it performs so well, if LLMs are that dependent on fast memory.
I also have a 3090 in another machine, but that’s also just 24 GB and I don’t want to shell out more money right now for playing with LLMs if the M1 Ultra is doing well enough :)
I tried a 70B model on my 48-core Epyc with 512 GB RAM and it was unusable. I think 1.5 t/s or so? Even if you double that, it’s not great. My M1 Ultra runs it comfortably at 6-7 t/s and sips power.
A dual 3090 setup is probably the most cost-effective solution at the moment, while the M1/M2 Ultra are the most power-efficient.
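For anyone wondering why memory speed matters so much here, a rough back-of-the-envelope sketch: during generation, every token needs roughly one full pass over the model weights, so memory bandwidth divided by model size gives a ceiling on tokens per second. The numbers below are assumptions, not measurements: 8 channels of DDR4-2666 at 8 bytes per transfer for the Epyc, Apple's advertised 800 GB/s for the M1 Ultra, and a 70B model quantized to about 4 bits (0.5 bytes) per parameter. The function names are made up for illustration.

```python
def ddr_bandwidth_gbs(channels: int, mega_transfers_per_s: float, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (channels * MT/s * bytes per transfer)."""
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


def max_tokens_per_s(bandwidth_gbs: float, params_billion: float, bytes_per_param: float) -> float:
    """Upper bound on generation speed if every token reads all weights once."""
    model_gb = params_billion * bytes_per_param  # assumed weight footprint in GB
    return bandwidth_gbs / model_gb


# Assumed hardware figures (theoretical peaks, not benchmarks).
epyc_bw = ddr_bandwidth_gbs(channels=8, mega_transfers_per_s=2666)  # about 170 GB/s
m1_ultra_bw = 800.0  # Apple's advertised unified memory bandwidth in GB/s

for name, bw in [("Epyc 7642, 8x DDR4-2666", epyc_bw), ("M1 Ultra", m1_ultra_bw)]:
    ceiling = max_tokens_per_s(bw, params_billion=70, bytes_per_param=0.5)
    print(f"{name}: {bw:.0f} GB/s -> at most ~{ceiling:.1f} t/s on a 4-bit 70B model")
```

Real-world numbers land well below these ceilings (compute, NUMA effects, and larger quantization formats all cost something), but the ratio between the two machines lines up reasonably with the 1.5 t/s vs. 6-7 t/s reported above.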
When I saw “gods” my first thought was “eh, those are dogs, not cats, right?”
Cats would 100% believe themselves to be gods.
But “gods” being a typo of “dogs” makes so much more sense
For sure! But the M1 Ultra still holds up really well. I doubt I will replace it for another 3 years at the very least. CPUs are currently progressing at an impressive rate across the board. Would I like an M3 Ultra? Sure, but do I really need it? Sadly no :) The upgrade to an M5 Ultra will be insane though.