• Looking at Hugging Face models, a raw (fp16) 20B is ~42 GB, which doesn't leave much room for big-model quants. A Q4_K_M of Llama 70B fits in that (its Q2 is ~30 GB), and the smallest Falcon 180B quantization is ~74 GB.

    That would make more sense while still being really impressive. If someone wants to math it out: what's the biggest parameter count that would fit in that at the lowest quants (Q2–Q3)?

    Disclaimer: bees (parameter counts) aren't everything; maybe they have a great dataset/money/lies.
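The "math it out" question above is just bits-per-weight arithmetic. Here's a rough sketch; the bits-per-weight figures are approximate averages for llama.cpp K-quants (an assumption, not exact file-format math):

```python
# Back-of-envelope: how many parameters fit in a given file size
# at a given quantization bit-width.

GB = 1e9  # decimal gigabytes, as model file sizes are usually reported

def max_params_billions(size_gb: float, bits_per_weight: float) -> float:
    """Largest parameter count (in billions) that fits in size_gb."""
    total_bits = size_gb * GB * 8
    return total_bits / bits_per_weight / 1e9

# Approximate average bits-per-weight for common K-quants (assumption).
for name, bpw in [("Q2_K", 2.6), ("Q3_K_M", 3.9), ("Q4_K_M", 4.8)]:
    print(f"{name} (~{bpw} bpw): ~{max_params_billions(42, bpw):.0f}B in 42 GB")
```

Under these assumptions a 42 GB file caps out around ~130B parameters at Q2 and ~85B at Q3, and Q4_K_M lands right at ~70B, consistent with the 70B Llama figure above.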