JoseConseco_@alien.top to LocalLLaMA • ExLlamaV2: The Fastest Library to Run LLMs • 1 year ago
So how much VRAM would be required for a 34B model or a 14B model? I assume no CPU offloading, right? With my 12 GB of VRAM, I guess I could only fit 14-billion-parameter models, and maybe not even that.
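For a rough figure, here is a back-of-envelope sketch (my own assumptions, not anything from ExLlamaV2 itself): weight memory is roughly parameter count × bits per weight, plus a couple of GB of overhead for the KV cache, activations, and the CUDA context.

```python
# Rough VRAM estimate for a quantized model (hypothetical helper, illustrative only).
# Assumption: weights dominate; a flat overhead_gb covers KV cache, activations, CUDA context.

def estimate_vram_gb(params_billions: float, bits_per_weight: float,
                     overhead_gb: float = 2.0) -> float:
    """Return an approximate VRAM requirement in GiB."""
    weight_gib = params_billions * 1e9 * bits_per_weight / 8 / 1024**3
    return weight_gib + overhead_gb

# Ballpark numbers for the sizes mentioned above, at common quantization levels
for size in (14, 34):
    for bpw in (4.0, 5.0):
        print(f"{size}B @ {bpw} bpw ≈ {estimate_vram_gb(size, bpw):.1f} GiB")
```

By this estimate a 14B model at ~4 bits per weight lands around 8-9 GiB, which fits in 12 GB of VRAM, while a 34B model at 4 bpw needs roughly 17-18 GiB and would not fit without offloading. Treat these as ballpark figures; actual usage depends on context length and the specific quantization.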