QuIP#: SOTA 2-bit quantization method, now implemented in text-generation-webui (experimental) (github.com)
Posted by oobabooga4@alien.top to LocalLLaMA · English · 1 year ago · 6 comments
AntoItaly@alien.top · 1 year ago
Wow, with this quantization method, LLama 70B weighs only 17.5GB!
Omg, how can I run it on a 3090?
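For anyone wondering where the 17.5GB figure comes from, here is a rough back-of-the-envelope sketch in Python. It assumes a nominal 70 billion parameters, exactly 2 bits per weight, and 1 GB = 10^9 bytes; the function name is made up for illustration. It counts weights only, so any quantization metadata, layers kept at higher precision, the KV cache, and activations would come on top of this. The 24 GB value is the VRAM of an RTX 3090.

```python
def quantized_weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Weight-only size in GB (1 GB = 1e9 bytes).

    Hypothetical helper: ignores quantization metadata, KV cache and activations.
    """
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1e9


llama_70b_params = 70e9   # assumption: nominal parameter count
rtx_3090_vram_gb = 24     # RTX 3090 VRAM

size_2bit = quantized_weight_size_gb(llama_70b_params, 2)
size_fp16 = quantized_weight_size_gb(llama_70b_params, 16)

print(f"2-bit weights: {size_2bit:.1f} GB")
print(f"fp16 weights:  {size_fp16:.1f} GB")
print(f"Left on a 24 GB card after 2-bit weights: {rtx_3090_vram_gb - size_2bit:.1f} GB")
```

So the weights alone should fit in 24 GB under these assumptions, but the remaining headroom has to cover context and runtime overhead, so real-world fit depends on the loader and context length.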