_qeternity_@alien.top to LocalLLaMA • How to minimize model inference costs? · 1 year ago
Batched inference.
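
To make that concrete: the idea is to send several prompts through the model in one forward pass rather than one call per prompt, so the fixed cost of each pass is shared across requests. Below is a rough sketch of the grouping logic only, not tied to any particular serving stack. The names `chunked`, `run_batched`, `model_fn`, and `CountingModel` are all made up for illustration, and `CountingModel` is a stand-in that just counts calls; in practice `model_fn` would wrap whatever batched generate call your framework provides.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def run_batched(
    prompts: Sequence[str],
    model_fn: Callable[[List[str]], List[str]],
    batch_size: int = 8,
) -> List[str]:
    """Call `model_fn` once per batch instead of once per prompt.

    `model_fn` is an assumed interface: it takes a list of prompts and
    returns one output per prompt, in the same order.
    """
    outputs: List[str] = []
    for batch in chunked(prompts, batch_size):
        results = model_fn(batch)
        if len(results) != len(batch):
            raise RuntimeError("model_fn must return one output per prompt")
        outputs.extend(results)
    return outputs


class CountingModel:
    """Hypothetical stand-in for a real model; it only counts forward passes."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, batch: List[str]) -> List[str]:
        self.calls += 1  # one simulated forward pass per batch
        return [text.upper() for text in batch]


if __name__ == "__main__":
    model = CountingModel()
    prompts = [f"prompt {i}" for i in range(10)]
    outputs = run_batched(prompts, model, batch_size=4)
    print(f"{len(outputs)} outputs from {model.calls} model calls")
```

With 10 prompts and `batch_size=4` the stand-in model is called 3 times instead of 10. How much that saves on real hardware depends on the model, sequence lengths, and available memory, so the batch size is something to tune rather than a fixed number.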