chewbie@alien.top to LocalLLaMA • "Macs with 32GB of memory can run 70B models with the GPU" • 1 year ago
Does anyone know how many streams of LLaMA 2 70B an Apple Mac Studio can run in parallel? Does each completion need the same amount of RAM, or does llama.cpp manage to share it between the different streams?
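For reference, llama.cpp's bundled HTTP server exposes parallel decoding through its slot mechanism: the model weights are loaded once and shared by all slots, while each slot gets its own share of the KV cache. A minimal sketch of launching it with several slots (the model filename here is a placeholder; flag behavior as documented in `llama-server --help`):

```shell
# Serve several completions concurrently from one copy of the weights
# using llama.cpp's llama-server (model path is a placeholder).
# -c 8192 : total KV-cache context, split evenly across the slots
# -np 4   : four parallel slots, so each slot gets 8192 / 4 = 2048 tokens
# -ngl 99 : offload all layers to the GPU (Metal on Apple Silicon)
./llama-server -m llama-2-70b.Q4_K_M.gguf -c 8192 -np 4 -ngl 99
```

So per-stream memory grows only by that stream's KV-cache slice, not by another full copy of the 70B weights; the practical limit on a given Mac is how much total context fits in the remaining unified memory.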