I really wish there were a site where you could plug in your hardware and see what t/s speed you could expect from it, so if anyone has a link like that, I’d be interested. I haven’t been able to find one, and I feel like I’m pretty much a noob when it comes to understanding which hardware specs matter for local fine-tuning, inference, and running models, so please bear with me as I ask a bunch of probably dumb questions.
Broadly and in order, I think single-GPU VRAM matters most (the more GB the better), then system RAM (same, but speed matters too, I think?), then PCIe bus bandwidth in GB/s, then additional GPUs (for roughly 60%, then 30%, then decreasing speedups from there), and finally CPU and/or NVMe space might matter a little. Does that sound broadly correct?
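From what I’ve read, single-batch generation is mostly memory-bandwidth-bound, so here’s the back-of-envelope I’ve been using in place of the site I wish existed. It’s a minimal sketch; the efficiency factor and example numbers are my guesses, not measurements:

```python
# Back-of-envelope tokens/s estimate for single-batch inference.
# Assumption: generation is memory-bandwidth-bound, so each token
# requires streaming the full set of model weights once.

def estimate_tps(model_size_gb: float, mem_bandwidth_gbs: float,
                 efficiency: float = 0.6) -> float:
    """Rough upper bound on tokens/s; efficiency is a fudge factor."""
    return efficiency * mem_bandwidth_gbs / model_size_gb

# Illustrative: a 13B model at 4-bit (~8 GB of weights) on a 3090
# (~936 GB/s of VRAM bandwidth):
print(estimate_tps(8, 936))  # ~70 t/s ballpark
# Same model spilled to dual-channel DDR4 (~50 GB/s):
print(estimate_tps(8, 50))   # ~4 t/s ballpark
```

If that heuristic is right, it’s also why VRAM comes first in my ordering: falling off the GPU costs you an order of magnitude.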
So the situation is I’ve got a ton of 30-series NVIDIA GPUs from a mining operation I wrapped up.
I could never sell them on r/hardwareswap or anywhere else, because nobody would buy in bulk, and I’m sure as hell not wasting my time selling and shipping 75+ individual GPUs to whoever. I do have racks and mobos and power supplies and whatever too, but I don’t think that matters. I also have a decent amount of 6800 and 6700 XT and 5700 XT AMD cards, but I don’t think that matters either; please correct me if I’m wrong.
I’d like to use as many GPUs as possible for local fine-tuning and inference, and am trying to figure out the best path for that. After reading about PCIe bandwidth and the diminishing speedups from a 2nd and 3rd GPU, I’m afraid the real answer is “sell some GPUs and buy an M2 Ultra Mac Pro” or something like that, but if we couldn’t go that route, what is the best path forward?
An EPYC server build with as many 3090s and 3080s as I can fit, and either 96 GB (2 sticks at full DDR5 speed) or 192 GB (4 sticks, which drops to roughly DDR4-class speeds) of RAM? Which RAM config is better? I think the DDR5 vs DDR4 speed actually makes a difference, but I’m not sure how much of a difference.
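If I’m doing the bandwidth math right, the difference only bites when layers spill out of VRAM into system RAM, since bandwidth is just channels × transfer rate × 8 bytes. A quick sketch; the two transfer rates below are my assumptions for the configs, not spec-sheet numbers:

```python
# System RAM bandwidth = channels * MT/s * 8 bytes per transfer.
# Both configs are dual-channel; 4 sticks just forces a lower clock.

def ram_bandwidth_gbs(channels: int, mt_per_s: int) -> float:
    return channels * mt_per_s * 8 / 1000  # GB/s

print(ram_bandwidth_gbs(2, 6000))  # 2 sticks at DDR5-6000: ~96 GB/s
print(ram_bandwidth_gbs(2, 3600))  # 4 sticks throttled to ~3600 MT/s: ~58 GB/s
```

So very roughly a 1.5x+ difference in offload speed between the two configs, if those clocks hold.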
Researching EPYC mobos, I think I can fit maybe 6 or 7 GPUs into an EPYC build; does that sound about right? Anyone know of any PCIe-rich mobos or architectures that I could fit notably more GPUs than that into? I do have a bunch of mining mobos, but don’t think they’re usable?
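Here’s my rough lane math for why 6-7 sounds plausible; single-socket EPYC exposes 128 PCIe lanes, and the lanes I reserve for other devices are a guess:

```python
# How many GPUs fit in a 128-lane EPYC lane budget at various widths.
TOTAL_LANES = 128
RESERVED = 16  # assumed set aside for NVMe / NIC / chipset

for width in (16, 8, 4):
    print(f"x{width}: {(TOTAL_LANES - RESERVED) // width} GPUs")
# x16: 7 GPUs, x8: 14, x4: 28 -- so physical slots, risers, and
# power are usually the real limit, not lanes.
```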
I’m pretty sure nothing like a Beowulf cluster of mining boards + GPUs is usable for model fine-tuning / running, is that correct?
I also have a Threadripper Linux box I could upgrade that can currently fit 4-6 GPUs, and I could upgrade to an AM5 mobo and a 7950X3D CPU pretty easily. I don’t know how this stacks up against an EPYC build; does anyone have any ideas on that?
I looked up my current Linux box’s mobo and the PCIe lanes only have 32 GB/s of bandwidth, so I think a mobo upgrade to AM5 with 128 GB/s would be necessary to get decent speeds. Does that sound right?
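For anyone else checking their board, here’s the per-slot math I used, if I’ve got it right. Per-lane throughput roughly doubles each PCIe generation; these are unidirectional figures, and I think the 128 GB/s number I quoted is a Gen5 x16 slot counted in both directions:

```python
# Approximate unidirectional PCIe bandwidth in GB/s per lane, by gen.
GBS_PER_LANE = {3: 0.985, 4: 1.969, 5: 3.938}

def slot_bandwidth(gen: int, lanes: int) -> float:
    return GBS_PER_LANE[gen] * lanes

print(slot_bandwidth(4, 16))  # ~31.5 GB/s: the "32 GB/s" on my current board
print(slot_bandwidth(5, 16))  # ~63 GB/s one way (~126 GB/s both directions)
```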
Sorry for all the questions and my general lack of knowledge; any guidance or suggestions on maximizing a bunch of GPUs are very welcome.
Magic number is about 8-10 in a single box, with a drop in perf when GPUs span CPUs. Probably needs risers to fit.
More than 6 is not really needed unless going for fine-tuning or full precision.
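Rough weights-only math for why ~6 cards covers quantized and even fp16 inference but fine-tuning blows way past it; the 70B model is just an illustration, and this ignores KV cache and activations:

```python
# Rough VRAM needs in GB (weights only) for a 70B-parameter model.
params_b = 70  # billions of parameters

print(params_b * 0.5)  # 4-bit quant: ~35 GB  -> 2x 24 GB 3090s
print(params_b * 2)    # fp16 "full precision": ~140 GB -> ~6x 3090s
print(params_b * 16)   # naive full fine-tune (fp32 weights + grads +
                       # Adam moments, ~16 B/param): ~1120 GB -> cluster
```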
Besides EPYC boards, old Xeons work as well. Supermicro made some servers for this. They can be had used, and you get your cooling and shiznit all in one package.
https://www.supermicro.com/en/products/system/4U/4029/SYS-4029GP-TRT.cfm
https://www.supermicro.com/products/system/4U/4028/SYS-4028GR-TRT.cfm
“sell some GPUs and buy an M2 Ultra Mac Pro”
lol, no. Or maybe yes. Make a GPU server and sell the rest to buy a 192 GB Mac for the best of both worlds.
I’m sure as hell not wasting my time selling and shipping 75+ individual GPUs to whoever
If these were all 3090s at ~$700 apiece used, 75 cards is about $52k of hardware. People don’t make that in a year.
Thanks much for the links to specific mobos, I appreciate it. With riser splitters, I can see fitting 10 GPUs in.
It really sounds like people think the “sell them and get real hardware” route is the best, but I really don’t have the time. Plus, they’re not all 3090s; there’s a ton of lesser 30-series cards like 3070s and 3060s too.
Standing offer to anyone in the thread: if you want to make an easy 25%, I’ll sell the lot of them to you for 75% of the average used price on eBay, and you can sell them all individually yourself.
Separate out the 3090s for yourself, keep a couple of lesser cards for fun stuff, and then put the rest up as a lot on eBay at their approximate price. Beats trusting random internet strangers.
I’m sure something like 10 3060s will sell if you price it right.
Yeah, I’m pretty much doing this (keeping all the 3090s and water-cooled 3080s), so your advice is solid.
When I first wrapped up my mining operation, I had my niece try selling a giant lot of the GPUs for a 15% cut (she sells stuff on eBay), and we didn’t get any takers in a couple of months with progressively lowered prices. She sold about $5k worth of cards individually, but that was it; it wasn’t really worth it.
Guess it’s worth trying again, splitting them into 3060 Ti and 3070 and whatever lots, next time I’m traveling to her city and can drop them off to her.
One GPU per layer is an interesting approach ;)
You could build an InfiniBand cluster. The 3090s would give you the most bang for the buck, though it’s a lot more work than trading out for A100s, and the extra hardware will cost you. You can get 9 GPUs on a single EPYC server mobo and still have good bandwidth. So we are talking about manually sourcing and building 10 boxes.
But unless you are training stuff and have cheap electricity, a cluster probably doesn’t make sense. No idea why you would need ~1800 GB of VRAM.
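For the math: ~1800 GB assumes every card in the lot were a 24 GB 3090. A quick sketch of the cluster sizing, where usable GPUs per box is my guess:

```python
import math

# Cluster sizing for the lot, assuming all 75 cards were 24 GB 3090s.
num_gpus = 75
vram_per_gpu_gb = 24
gpus_per_box = 8  # assumed usable per EPYC box after slots/power

print(num_gpus * vram_per_gpu_gb)          # 1800 GB total VRAM
print(math.ceil(num_gpus / gpus_per_box))  # 10 boxes to build
```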
No idea why you would need ~1800 GB of VRAM.
Homeboy’s waifu is gonna be THICC.
Thanks for pointing me to InfiniBand, another thing for me to research. Sounds like a high-bandwidth interconnect for coordinating nodes in a cluster, so sort of like the Beowulf cluster idea.
I actually do have cheap electricity thanks to solar plus a honking big LiFePO4 battery bank.
Is this what AWS and other places where you can rent time on H100s do? Have a bunch of A100 and H100 servers hooked up in arrays with InfiniBand?
Hold onto those 3090s. Juice Labs has a really interesting open-source project that lets you combine multiple GPUs into a virtual GPU over IP.
Going to be giving it a shot myself in the next few weeks. (I have a pretty decent understanding of it at this point; feel like trading a 3090 for setup assistance? 😉)