So one thing that had really been bothering me was that recent arXiv paper claiming that, despite GPT-3 being 175B and GPT-4 being around 1.7T, GPT-3.5 Turbo was somehow only 20B.

This had been on my mind for the past couple of days because it just made no sense to me, so this evening I went to check out the paper again and noticed that I could no longer download the PDF or PostScript. Then I saw this update comment on the arXiv page, added yesterday:

Contains inappropriately sourced conjecture of OpenAI’s ChatGPT parameter count from this http URL, a citation which was omitted. The authors do not have direct knowledge or verification of this information, and relied solely on this article, which may lead to public confusion

That link leads to a Forbes article, from before GPT-4 was even released, which claims that ChatGPT in general is 20B parameters:

It seems like the chatbot application was one of the most popular ones, so ChatGPT came out first. ChatGPT is not just smaller (20 billion vs. 175 billion parameters) and therefore faster than GPT-3, but it is also more accurate than GPT-3 when solving conversational tasks—a perfect business case for a lower cost/better quality AI product.

So it would appear that they sourced that figure from Forbes, and after everyone got really confused, they realized it might not actually be correct and the paper got modified.

So, before some wild urban legend forms that GPT-3.5 is 20B, I just thought I’d mention that lol.

  • ttkciar@alien.topB · 1 year ago

    Perhaps someone heard “10x reduction in footprint” and didn’t realize that meant a reduction in bytes, not a reduction in parameters, and concluded it had a tenth as many parameters?
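
    Back-of-envelope sketch of that misreading (the 175B and fp16 assumptions are illustrative, not from the paper):

    ```python
    # Hypothetical: a 175B-parameter model stored in fp16.
    params = 175e9
    fp16_gb = params * 2 / 1e9        # ~350 GB of weights

    # A "10x reduction in footprint" shrinks bytes, not parameters:
    compressed_gb = fp16_gb / 10      # ~35 GB, but still 175B parameters

    # Misreading it as a parameter reduction instead gives a number
    # suspiciously close to the 20B figure:
    misread_params_b = params / 10 / 1e9   # 17.5B, easy to round up to "20B"

    print(f"{fp16_gb:.0f} GB -> {compressed_gb:.0f} GB (bytes); "
          f"misread as {misread_params_b:.1f}B parameters")
    ```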

    • Tight_Range_5690@alien.topB · 1 year ago

      Looking at Hugging Face models, a raw 20B in fp16 is ~42 GB, which isn't a lot of space to fit quants of bigger models: a Q4_K_M of Llama 70B fits in that (Q2 is ~30 GB), and the smallest Falcon 180B quantization is ~74 GB.

      That would make more sense while still being really impressive. Not sure if someone wants to math it out properly, but what's the biggest model (in B) that would fit in that at the lowest quants (Q2-Q3)? Rough sketch below.

      Disclaimer: parameter count (Bs) isn't everything; maybe they have a great dataset, money, or lies.
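
      Rough sketch of the math (the bits-per-weight values are my assumptions, eyeballed from typical GGUF file sizes, not official figures):

      ```python
      # Assumed effective bits-per-weight for common llama.cpp quants
      # (rough averages inferred from typical GGUF file sizes; real files vary).
      BITS_PER_WEIGHT = {"fp16": 16.0, "Q4_K_M": 4.85, "Q3_K_M": 3.9, "Q2_K": 3.4}

      def file_size_gb(params_billion: float, quant: str) -> float:
          """Approximate weight-file size in GB for a model of the given size."""
          return params_billion * BITS_PER_WEIGHT[quant] / 8

      # Sanity checks against the sizes mentioned above:
      print(f"20B fp16:   ~{file_size_gb(20, 'fp16'):.0f} GB")    # ~40 GB
      print(f"70B Q4_K_M: ~{file_size_gb(70, 'Q4_K_M'):.0f} GB")  # ~42 GB
      print(f"70B Q2_K:   ~{file_size_gb(70, 'Q2_K'):.0f} GB")    # ~30 GB
      print(f"180B Q2_K:  ~{file_size_gb(180, 'Q2_K'):.0f} GB")   # ~77 GB

      # Biggest model that squeezes into a ~42 GB budget at the low quants:
      budget_gb = 42
      for quant in ("Q3_K_M", "Q2_K"):
          max_b = budget_gb * 8 / BITS_PER_WEIGHT[quant]
          print(f"~{budget_gb} GB at {quant}: roughly {max_b:.0f}B parameters")
      ```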

    • ambient_temp_xeno@alien.topB · 1 year ago

      So they, as big-shot Microsoft scientists, just decided that was good enough to stick in a table in their paper?