The title, pretty much.
I’m wondering whether a 70B model quantized to 4-bit would perform better than a 7B/13B/34B model at fp16. It would be great to get some insights from the community.
As a rule of thumb, yes: a higher-parameter model at a low quant generally beats a lower-parameter model at a high quant (or no quant at all). Take it with a grain of salt, though, since you may still prefer a smaller model that’s better tuned for your specific task.
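If it helps to see the memory side of the tradeoff, here’s a rough back-of-envelope sketch (weights only, ignoring KV cache, activations, and quantization overhead; the numbers are approximations, not measurements):

```python
# Approximate VRAM needed just to hold the weights.
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1024**3

configs = {
    "70B @ 4-bit": (70, 4),
    "34B @ fp16":  (34, 16),
    "13B @ fp16":  (13, 16),
    "7B  @ fp16":  (7, 16),
}

for name, (params, bits) in configs.items():
    print(f"{name:>12}: ~{weight_vram_gb(params, bits):.0f} GB")
```

So a 4-bit 70B sits around ~33 GB of weights, which is more than a 13B at fp16 (~24 GB) but far less than 70B at fp16, which is the usual argument for running the bigger model quantized if your hardware can fit it.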