With llama.cpp you can offload some layers to VRAM; you may be able to run a 70B model, depending on the quantization.
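
For example, with the llama-cpp-python bindings you control how many layers go to the GPU via `n_gpu_layers`. A minimal sketch, assuming a quantized 70B GGUF file; the model path, layer count, and context size are illustrative, not values from the comment:

```python
# pip install llama-cpp-python (built with GPU support)
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-70b.Q4_K_M.gguf",  # hypothetical quantized 70B GGUF
    n_gpu_layers=40,  # offload as many layers as fit in VRAM; the rest stays in RAM
    n_ctx=2048,       # context window; larger contexts need more memory
)

out = llm("Q: What does layer offloading do? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

Raise or lower `n_gpu_layers` until the model loads without running out of VRAM; everything not offloaded is served from system RAM, which is slower but lets the model fit.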