NVidia H200 achieves nearly 12,000 tokens/sec on Llama2-13B with TensorRT-LLM (github.com)
Posted by rihard7854@alien.top to LocalLLaMA · 1 year ago · 23 comments
Herr_Drosselmeyer@alien.top · 1 year ago
Obviously. There aren’t many people in the world with 50k burning a hole in their pockets, and of those, even fewer are nerdy enough to want to set up their own AI server in their basement just to tinker with.