QuIP#: SOTA 2-bit quantization method, now implemented in text-generation-webui (experimental) (github.com)
oobabooga4@alien.top to LocalLLaMA · English · 1 year ago
a_beautiful_rhind@alien.top · 1 year ago
From the issue about this in the exllamav2 repo, QuIP# was using more memory and was slower than exl. How much context can you fit?