QuIP#: SOTA 2-bit quantization method, now implemented in text-generation-webui (experimental)
Posted by oobabooga4@alien.top to LocalLLaMA · 2 years ago · link: github.com · 6 comments
AntoItaly@alien.top · 2 years ago
Wow, with this quantization method, Llama 70B weighs only 17.5 GB!
OMG, how can I run it on a 3090?
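For anyone wondering where the 17.5 GB figure comes from, here is a rough back-of-the-envelope sketch. It assumes exactly 70 billion parameters at a flat 2 bits each and counts the weights only; the function name `weight_size_gb` is made up for illustration, and the 24 GB figure is the nominal VRAM of an RTX 3090. Real files and real memory use will likely be somewhat larger (KV cache, activations, and any per-format overhead are ignored here).

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: a "70B" model has exactly 70e9 parameters.
N_PARAMS = 70e9
# Nominal VRAM of an RTX 3090, in GB.
RTX_3090_VRAM_GB = 24

size_2bit = weight_size_gb(N_PARAMS, 2)    # 2-bit quantized weights
size_fp16 = weight_size_gb(N_PARAMS, 16)   # unquantized fp16 weights, for comparison

# Weights-only check; says nothing about context length or runtime overhead.
weights_fit = size_2bit < RTX_3090_VRAM_GB

print(f"2-bit: {size_2bit:.1f} GB, fp16: {size_fp16:.1f} GB, "
      f"weights fit in {RTX_3090_VRAM_GB} GB: {weights_fit}")
```

So on paper the 2-bit weights leave a few GB of headroom on a single 3090, but how much context actually fits would need to be tested.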