https://huggingface.co/deepnight-research
I’m not affiliated with this group at all, I was just randomly looking for any new big merges and found these.
100B model: https://huggingface.co/deepnight-research/saily_100B
220B model: https://huggingface.co/deepnight-research/Saily_220B
600B model: https://huggingface.co/deepnight-research/ai1
They have some big claims about the capabilities of their models, but the two best ones are unavailable to download. Maybe we can help convince them to release them publicly?
They have lifted the gate on the 100B model… it seems to be pending evaluation on the Open LLM Leaderboard. They’re also saying they’ll lift the gate on the 220B model before Christmas.
Let’s see whether all this is just a publicity stunt or they really did it.
How much RAM do you think the 600B would take? I have 512GB and I can fit another 512GB in my box before I run out of slots. I think with 1TB I should be able to run it unquantized, because Falcon 180B used slightly less than half my RAM.
Can you please share a bit more about your setup and experiences?
I’ve been looking to use some of my idle enterprise gear for LLMs, but everyone tells me not to bother. I’ve got a few dual-Xeon boxes with quad-channel DDR4 in 256GB and 384GB capacities, NVMe or RAID10 SSDs, 10GbE, etc., and I guess (having not yet experienced it) I have a hard time imagining the equivalent of 120GHz, 0.5 to 1TB of RAM, and 7GB/s disk reads “not being fast enough.” I don’t need instant responses from a sex chatbot; rather, I would like to run a model that can help my wife (in the medical field) with work queries, help my school-age kid with math and grammar questions, etc.
Thank you much!
If you have the RAM, don’t worry about disk at all; if you have to drop to any kind of disk, even a Gen 5 SSD, your speeds will tank. Memory bandwidth matters far more than compute for LLM inference, but it all depends on your needs. There are probably cheaper ways to go about this if you only need something occasionally, maybe RunPod or similar. But if you need a lot of inference, running locally could save you money, though renting a big machine with A100s will always be faster. So: will a 7B model do what you need, or do you need the accuracy and comprehension of a 70B or one of the new 120B merges? Also, Llama 3 is supposed to be out in Jan/Feb, and if it’s significantly better then everything changes again.
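To put rough numbers on the memory-bandwidth point (a rule-of-thumb sketch with illustrative figures I’m assuming, not benchmarks): single-stream decode speed is bounded by how fast the weights can be streamed through memory, which is roughly bandwidth divided by model size:

```python
def max_decode_tok_s(mem_bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on single-stream decode speed when memory-bandwidth-bound:
    each generated token reads (roughly) every weight once."""
    return mem_bandwidth_gb_s / model_size_gb

# Illustrative numbers, not measurements:
print(max_decode_tok_s(100.0, 40.0))   # quad-channel DDR4 (~100 GB/s), ~40GB 4-bit 70B -> 2.5 tok/s
print(max_decode_tok_s(2000.0, 40.0))  # A100-class HBM (~2 TB/s), same model -> 50.0 tok/s
```

That gap is why people say CPU boxes “aren’t fast enough” even with plenty of RAM, though a few tokens per second may be perfectly usable for occasional queries.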
We need some 4090s modified in China with 500GB of VRAM, if possible.
I wonder if there’s enough real demand for even 48GB 4090s to incentivize somebody to do it. I bet the hardware/electronics part of it is trivial, though.
If people started doing this with any regularity, Nvidia would intentionally bork the drivers.
We need some hero to develop an app that downloads more GPU memory like those apps back in the 90’s.
/s
The devs mentioned that the 600B model takes about 1.3TB of space alone…
Make it 0.01bpw quantized and it will fit on a good ol’ 3090.
Give it five years with the Mac Studio. Next year 256GB; it will go up real quick.
Honestly, a 4bit quantized version of the 220B model should run on a 192GB M2 Studio, assuming these models could even work with a current transformer/loader.
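The arithmetic behind that claim, as a rough sketch (the ~10% quantization overhead and the ~75% GPU-usable fraction of unified memory are my assumptions, not measured values):

```python
def quantized_size_gb(num_params: float, bits_per_weight: float, overhead: float = 0.10) -> float:
    """Approximate in-memory size of a quantized model in GB,
    with a fudge factor for scales/zero-points and runtime overhead."""
    return num_params * bits_per_weight / 8 / 1e9 * (1 + overhead)

size = quantized_size_gb(220e9, 4)
print(round(size, 1))     # -> 121.0 GB
print(size < 192 * 0.75)  # -> True: under the ~144GB macOS typically lets the GPU address
```

So a 4-bit 220B should squeeze into 192GB with room left for KV cache, assuming a loader even supports the architecture.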
“organisations”…
It’s the best out there… but no, you can’t try it, because it’s too dangerous.
I doubt there is any model, really… follow the trail and you’ll end up at a company founded by a single person from India (who is the founder of another company with a single app for collaborative drawing)… which, at the very least, doesn’t have any employees on LinkedIn…
And the founder looks like a relatively young person who most likely wouldn’t even be able to gather the funding required for enough GPU compute to make a model better than GPT-4 (or have the know-how). I think it’s just a front for him trying to get some hype or funding.
Uuummmm, no. It’s for sure real. And the best one out there. No questions asked. It’s better than ChatGPT-4, and OpenAI has been trying to hack this new company to get the 600B model because they’re scared it will end OpenAI for good.
Obligatory /s
You forgot to mention that your uncle is the CEO of OpenAI! 😉
Well that’s because he’s not. Sam is actually my dad.
https://in.linkedin.com/company/deepnight
View 1 employee
Work experience: Google Startup Alumni
lmao
A cursory look at the website makes me think these guys don’t know what they’re doing.
Everything on that page is hype for something that doesn’t exist.
Right. This part right here is very suspicious to me, and I’m taking their claims with a grain of salt.
No! The model is not going to be available publically. APOLOGIES. The model like this can be misused very easily. The model is only going to be provided to already selected organisations.
I think they changed it to it’s still an experiment and they are finishing evaluations to better understand the model.
No they haven’t, on the 220B model it’s always been that message above, while on the 600B model it’s a message similar to the one you stated.
I guess they might open source the 600B one? They have different names, so maybe different training approaches.
Somebody pilfer this thing and quant it. We can run the 100B for sure. At least at Q3.
“Prompt Template: Alpeca” Wut?
Looks like a scam, to be fair. I bet if you apply, you’ll get “Just send us $100 for access!”
Those Microsoft tech-support scam calls are reaching a new level.
So it sounds like for the 600B they just fine-tuned Llama 2 again on the same stuff Llama 2 was trained on, just more of it…
RefinedWeb
Open-source code from GitHub
Common Crawl
“We fine-tuned the model on a huge dataset (generated manually and with automation) for logical understanding and reasoning. We also trained the model for function-calling capabilities.”
It’s private, so there is absolutely zero way to confirm its quality.
Not just private but closed access.
Some quotes I found on the pages:
“No! The model is not going to be available publically. APOLOGIES. The model like this can be misused very easily. The model is only going to be provided to already selected organisations.”
“[SOMETHING SPECIAL]: AIN’T DISCLOSING!🧟”
“Hallucinations: Reduced Hallucinations 8x compared to ChatGPT 🥳”
My guess: it’s just another merge like Goliath. At best it’s marginally better than a good 70B.
I can also “successfully build 220B model” easily with mergekit. Would it be good? Probably not.
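For scale (toy numbers of my own, not their actual recipe): a passthrough-style merge just concatenates layer ranges from same-width donor models, so the parameter count balloons with zero new training:

```python
def stacked_params(per_layer_params: float, layer_ranges: list[tuple[int, int]]) -> float:
    """Parameter count of a passthrough-style merge that stacks
    (end-exclusive) layer ranges from donor models of the same width."""
    total_layers = sum(end - start for start, end in layer_ranges)
    return total_layers * per_layer_params

# Llama-2-70B has 80 layers -> ~0.875B params per layer (ignoring embeddings).
per_layer = 70e9 / 80
merged = stacked_params(per_layer, [(0, 40), (20, 60), (40, 80)])
print(round(merged / 1e9))  # -> 105, i.e. a "100B-class" model from two 70B donors
```

Which is exactly why a big parameter count on a model card, by itself, proves nothing about capability.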
The lab should explain on their model card why I shouldn’t think it’s just bullshit. This is hardly the first mystery lab making big claims.
I doubt there’s any model there.
Wonder if GPT4 is just a series of merges
Inb4 TheBloke quantizes it down to about 100B size.
We need a different flair for “New Model” vs. “New Merge/Finetune”.
Np I’ll quantize to 0.001 bpw
I don’t understand: are all these models based on Llama? How much better is the 100B than Goliath 120B? There are a lot of questions. As far as we know, Goliath was made by an AI enthusiast. Did this team make all three models?
Deepnight were the guys who uploaded Upstage’s Instruct v2, claimed it was their own, then deleted it with an oopsie-whoopsie.
I am skeptical.