• Looking at Hugging Face models, a raw (fp16) 20B is ~42 GB, which doesn't leave much room for big-model quants. A Q4_K_M of Llama 70B fits in that (its Q2 is ~30 GB), and the smallest Falcon 180B quantization is ~74 GB.

    That would make more sense while still being really impressive. If someone wants to math it out: what's the biggest parameter count that would fit in that at the lowest quants (Q2–Q3)?

    Disclaimer: bees (parameter counts) aren't everything; maybe they have a great dataset/money/lies.
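The "math it out" question above is just bits-per-weight arithmetic. Here's a rough sketch; the bits-per-weight figures are approximate averages for llama.cpp K-quants (an assumption, not exact file-format math):

```python
# Back-of-envelope: how many parameters fit in a given file size
# at a given quantization bit-width.

GB = 1e9  # decimal gigabytes, as model file sizes are usually reported

def max_params_billions(size_gb: float, bits_per_weight: float) -> float:
    """Largest parameter count (in billions) that fits in size_gb."""
    total_bits = size_gb * GB * 8
    return total_bits / bits_per_weight / 1e9

# Approximate average bits-per-weight for common K-quants (assumption).
for name, bpw in [("Q2_K", 2.6), ("Q3_K_M", 3.9), ("Q4_K_M", 4.8)]:
    print(f"{name} (~{bpw} bpw): ~{max_params_billions(42, bpw):.0f}B in 42 GB")
```

Under these assumptions a 42 GB file caps out around ~130B parameters at Q2 and ~85B at Q3, and Q4_K_M lands right at ~70B, consistent with the 70B Llama figure above.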