alchemist1e9@alien.top to LocalLLaMA · English · 1 year ago
ExLlamaV2: The Fastest Library to Run LLMs (towardsdatascience.com)
22 comments
JoseConseco_@alien.top · 1 year ago
So how much VRAM would be required for a 34B model or a 14B model? I assume no CPU offloading, right? With my 12 GB of VRAM, I guess I could only fit 14-billion-parameter models, and maybe not even that.
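A rough back-of-envelope answer to the question above: with a quantized model, weight memory is approximately parameters × bits-per-weight ÷ 8, plus headroom for the KV cache and activations. A minimal sketch (the 1.5 GB overhead figure and the example bit-widths are assumptions, not measured values):

```python
def estimate_vram_gb(params_billions: float,
                     bits_per_weight: float,
                     overhead_gb: float = 1.5) -> float:
    """Back-of-envelope VRAM estimate for a quantized LLM.

    overhead_gb is an assumed allowance for KV cache and
    activations; real usage depends on context length and batch size.
    """
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb

# 34B at ~4 bits per weight: ~17 GB of weights alone,
# so it will not fit in 12 GB of VRAM.
print(estimate_vram_gb(34, 4.0))

# 14B at ~4 bits per weight: ~7 GB of weights plus overhead,
# which should fit in 12 GB with room for context.
print(estimate_vram_gb(14, 4.0))
```

By this estimate, a 12 GB card can hold a 14B model at around 4 bits per weight, while a 34B model would need either a much lower bit-width or more VRAM.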