GitHub - S-LoRA/S-LoRA: S-LoRA: Serving Thousands of Concurrent LoRA Adapters (github.com)
AutomataManifold@alien.top to LocalLLaMA · English · 2 years ago · 5 comments
dreamingleo12@alien.top · 2 years ago
I'm wondering though, from an engineering perspective: when traffic is high, wouldn't this cause a lot of weight switching? Throughput would basically be limited by host-to-device bandwidth.
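For a rough sense of scale, here is a back-of-envelope sketch of the raw copy cost behind that concern. Every number in it is an assumption picked for illustration (a 7B-class model with hidden size 4096, 32 layers, rank-16 adapters on four square projection matrices per layer, fp16 weights, and about 16 GB/s of effective host-to-device bandwidth); none of it comes from the S-LoRA repo, and the function names are hypothetical.

```python
def lora_adapter_bytes(hidden_dim, rank, num_layers, matrices_per_layer,
                       bytes_per_param=2):
    """Approximate size of one LoRA adapter, assuming square d x d target matrices."""
    # Each adapted matrix gets two low-rank factors: A (rank x d) and B (d x rank).
    params_per_matrix = 2 * rank * hidden_dim
    return params_per_matrix * matrices_per_layer * num_layers * bytes_per_param


def transfer_seconds(num_bytes, bandwidth_bytes_per_s):
    """Time to copy num_bytes at a given sustained bandwidth (ignores latency)."""
    return num_bytes / bandwidth_bytes_per_s


# Assumed, illustrative values only.
adapter_bytes = lora_adapter_bytes(hidden_dim=4096, rank=16,
                                   num_layers=32, matrices_per_layer=4)
swap_s = transfer_seconds(adapter_bytes, bandwidth_bytes_per_s=16e9)

print(f"{adapter_bytes / 2**20:.1f} MiB per adapter")  # 32.0 MiB per adapter
print(f"{swap_s * 1e3:.2f} ms per swap")               # 2.10 ms per swap
```

Under these assumptions a single swap costs about 2 ms of pure transfer time, so the question becomes how many adapters miss GPU memory per batch. The sketch only covers the raw copy and ignores any caching of hot adapters or overlap of transfers with compute.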