https://huggingface.co/deepnight-research
I’m not affiliated with this group at all; I was just browsing for new big merges and found these.
100B model: https://huggingface.co/deepnight-research/saily_100B
220B model: https://huggingface.co/deepnight-research/Saily_220B
600B model: https://huggingface.co/deepnight-research/ai1
They make some big claims about their models’ capabilities, but the two best ones aren’t available for download. Maybe we can help convince them to release them publicly?
How much RAM do you think the 600B would take? I have 512 GB and can fit another 512 GB in my box before I run out of slots. I think with 1 TB I should be able to run it unquantized, since Falcon 180B used slightly less than half my RAM.
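For a rough sanity check: a dense model's weights alone take roughly parameter-count × bytes-per-parameter, with KV cache and runtime overhead on top. A minimal sketch, assuming a dense 600B parameter count (the precisions and overhead are illustrative, not published figures):

```python
# Back-of-the-envelope RAM estimate for dense-model inference.
# The 600B parameter count and the precision list are assumptions
# for illustration; KV-cache and runtime overhead come on top.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """RAM for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

N_PARAMS = 600e9  # assumed dense 600B model

for label, bpp in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"{label:>5}: ~{weight_memory_gb(N_PARAMS, bpp):,.0f} GB of weights")

# fp16 : ~1,200 GB -> does not fit in 1 TB
# 8-bit:   ~600 GB -> fits in 1 TB with headroom
# 4-bit:   ~300 GB -> fits in the existing 512 GB
```

By that arithmetic, unquantized fp16 weights alone would be about 1.2 TB, so 1 TB would need at least 8-bit quantization. For comparison, Falcon 180B at fp16 is ~360 GB of weights, so a build that fit in under half of 512 GB was most likely already quantized.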
Can you please share a bit more about your setup and experiences?
I’ve been looking to put some of my idle enterprise gear to work on LLMs, but everyone tells me not to bother. I’ve got a few dual-Xeon boxes with quad-channel DDR4 in 256 and 384 GB capacities, NVMe or RAID10 SSDs, 10 GbE, etc., and (having not yet experienced it) I have a hard time imagining the equivalent of 120 GHz of aggregate clock, 0.5–1 TB of RAM, and 7 GB/s disk reads “not being fast enough.” I don’t need instant responses from a sex chatbot; rather, I’d like to run a model that can help my wife (in the medical field) with work queries, help my school-age kid with math and grammar questions, etc.
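For a rough sense of why people say “don’t bother”: CPU token generation is bound by memory bandwidth, not clock speed or disk reads, because each generated token streams roughly the whole set of weights through RAM once. A quick sketch of the upper bound; the ~85 GB/s figure is an assumption for one socket of quad-channel DDR4-2666, and real throughput is lower:

```python
# Upper bound on CPU generation speed:
#   tokens/sec <= memory bandwidth / model size in bytes
# 85 GB/s is an assumed single-socket figure for quad-channel
# DDR4-2666 (2666 MT/s * 8 bytes * 4 channels ~= 85 GB/s).

def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

for model_gb in (40, 180, 600):  # e.g. 70B@4-bit, 180B@8-bit, 600B@8-bit
    print(f"{model_gb:>4} GB model: <= {max_tokens_per_sec(85, model_gb):.2f} tok/s")

# Output (upper bounds, single socket):
#   40 GB model: <= 2.12 tok/s
#  180 GB model: <= 0.47 tok/s
#  600 GB model: <= 0.14 tok/s
```

So that gear is perfectly usable for smaller quantized models; it’s the 100B+ class where interactive speeds stop being realistic on CPU.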
Thank you very much!