dreamingleo12@alien.top to LocalLLaMA • GitHub - S-LoRA/S-LoRA: S-LoRA: Serving Thousands of Concurrent LoRA Adapters • 1 year ago
I’m wondering though, from an engineering perspective: when traffic is high, wouldn’t this cause a lot of weight switching between host and GPU memory? Throughput would then basically be limited by host-to-device bandwidth.
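To put a rough number on the concern, here is a back-of-envelope sketch of what one adapter swap might cost. Everything in it is an assumption for illustration, not a figure from the S-LoRA repo: a 7B-class model shape (hidden size 4096, 32 layers), rank-16 LoRA on four square projections per layer, fp16 weights, and about 25 GB/s of effective host-to-device bandwidth. The function names are hypothetical.

```python
def lora_adapter_bytes(hidden, layers, rank, targets_per_layer, bytes_per_param=2):
    """Approximate size of one LoRA adapter in bytes.

    Assumes every adapted projection is square (hidden x hidden), so each one
    adds two low-rank matrices: A (rank x hidden) and B (hidden x rank).
    """
    params_per_target = 2 * hidden * rank
    return params_per_target * targets_per_layer * layers * bytes_per_param


def swap_seconds(adapter_bytes, bandwidth_bytes_per_s):
    """Time to copy one adapter host-to-device, ignoring latency and overlap."""
    return adapter_bytes / bandwidth_bytes_per_s


# Assumed values, chosen only to illustrate the arithmetic.
size = lora_adapter_bytes(hidden=4096, layers=32, rank=16, targets_per_layer=4)
t = swap_seconds(size, bandwidth_bytes_per_s=25e9)

print(f"adapter size: {size / 2**20:.1f} MiB")
print(f"swap time:    {t * 1e3:.2f} ms")
print(f"max swaps/s:  {1 / t:.0f}")
```

Under these assumptions an adapter is about 32 MiB and a swap takes roughly 1.3 ms, so the link could move a few hundred adapters per second at best. Whether that becomes the bottleneck would depend on how often a request's adapter is not already resident on the GPU, and on whether transfers can overlap with compute.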