DavidSJ@alien.top to LocalLLaMA • Macs with 32GB of memory can run 70B models with the GPU. • 1 year ago

There will hopefully be more optimizations to speed this up. Speculative, Jacobi, or lookahead decoding could speed things up quite a bit.
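For anyone wondering what speculative decoding looks like, here is a minimal sketch of the greedy variant. A small draft model guesses a few tokens ahead, and the large target model checks them, keeping the longest prefix it agrees with. Everything here is an assumption made for illustration: `toy_target`, `toy_draft`, and the `k` parameter are hypothetical stand-ins, not any real library's API. A real implementation gets its speedup by verifying all drafted tokens in one batched forward pass; this toy calls the target once per token, so it only shows the accept/reject logic. Jacobi and lookahead decoding are not shown.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def toy_target(prefix: List[int]) -> int:
    """Hypothetical stand-in for the large, slow model."""
    return (sum(prefix) * 31 + len(prefix)) % 11


def toy_draft(prefix: List[int]) -> int:
    """Hypothetical stand-in for the small, fast model: agrees with the
    target except when the prefix length is a multiple of 3."""
    t = toy_target(prefix)
    return t if len(prefix) % 3 else (t + 1) % 11


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Plain one-token-at-a-time greedy decoding, used as the baseline."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: draft k tokens, keep what the target confirms."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks each proposed position in order.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First disagreement: use the target's token and stop.
                accepted.append(expected)
                break
        else:
            # Every draft token matched, so the target adds one more token.
            accepted.append(target(tokens + accepted))

        tokens.extend(accepted)
    return tokens[:goal]


if __name__ == "__main__":
    prompt = [1, 2, 3]
    print(speculative_decode(toy_target, toy_draft, prompt, n_new=12))
```

The point of the design is that every kept token is one the target itself would have chosen, so the output matches plain greedy decoding with the target model. The draft model only affects how many tokens are confirmed per round.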