GitHub - S-LoRA/S-LoRA: S-LoRA: Serving Thousands of Concurrent LoRA Adapters (github.com)
AutomataManifold@alien.top to LocalLLaMA · 1 year ago
dreamingleo12@alien.top · 1 year ago
I’m wondering, though: from an engineering perspective, wouldn’t this cause a lot of weight switching when traffic is high? Throughput would basically be limited by host-to-device bandwidth.
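
For a rough sense of scale, here is a back-of-envelope sketch of how long one adapter swap might take over the host-to-device link. Everything numeric in it is an assumption chosen for illustration (hidden size 4096, rank 16, 32 layers, 4 adapted square projection matrices per layer, fp16 weights, 25 GB/s effective bandwidth), and the function names are hypothetical; none of it is taken from the S-LoRA repository.

```python
# Back-of-envelope estimate of LoRA adapter swap cost over host-to-device transfer.
# All default numbers below are illustrative assumptions, not measurements.

def lora_adapter_bytes(hidden_size, rank, num_layers, matrices_per_layer,
                       bytes_per_param=2):
    """Size of one LoRA adapter, assuming every adapted matrix is square
    (hidden_size x hidden_size) and gets a pair of low-rank factors
    A (hidden_size x rank) and B (rank x hidden_size)."""
    params_per_matrix = 2 * hidden_size * rank
    total_params = params_per_matrix * matrices_per_layer * num_layers
    return total_params * bytes_per_param


def transfer_seconds(num_bytes, bandwidth_bytes_per_s):
    """Ideal copy time, ignoring latency and any overlap with compute."""
    return num_bytes / bandwidth_bytes_per_s


def max_swaps_per_second(num_bytes, bandwidth_bytes_per_s):
    """Upper bound on adapter loads per second if the link did nothing else."""
    return bandwidth_bytes_per_s / num_bytes


if __name__ == "__main__":
    size = lora_adapter_bytes(hidden_size=4096, rank=16,
                              num_layers=32, matrices_per_layer=4)
    bandwidth = 25e9  # assumed effective host-to-device bandwidth, bytes/s
    print(f"adapter size: {size / 2**20:.1f} MiB")
    print(f"copy time:    {transfer_seconds(size, bandwidth) * 1e3:.2f} ms")
    print(f"max swaps/s:  {max_swaps_per_second(size, bandwidth):.0f}")
```

Under these assumed numbers a single adapter is tens of MiB, so whether swapping becomes the bottleneck depends on how often a request needs an adapter that is not already resident on the GPU, and on how well the copies overlap with compute.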