NVidia H200 achieves nearly 12,000 tokens/sec on Llama2-13B with TensorRT-LLM (github.com)
rihard7854@alien.topB to LocalLLaMA · English · 1 year ago · 23 comments
MeMyself_And_Whateva@alien.topB · English · 1 year ago
If there's a version available for a tenth of the price, I could settle for 1,200 t/s without problems.