rihard7854@alien.top to LocalLLaMA · English · 2 years ago

NVidia H200 achieves nearly 12,000 tokens/sec on Llama2-13B with TensorRT-LLM

github.com

  • yamosin@alien.top · 2 years ago

    The H100 costs about $30,000, so I guess this one will be $70,000.

    • FullOf_Bad_Ideas@alien.top · 2 years ago

      The same benchmark on an H100 gives about 9,000 tokens/sec, and you can rent an H100 for $5/h on RunPod.
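      For anyone who wants to turn the numbers in this thread into a cost per token, here is a rough back-of-the-envelope sketch. It only uses the figures quoted above ($5/h rental and about 9,000 tokens/sec for the H100, and "nearly 12,000" tokens/sec for the H200, rounded to 12,000 here as an assumption). The function names are made up for illustration, and the result assumes the GPU runs at that throughput the whole hour, which real workloads probably won't.

```python
def cost_per_million_tokens(hourly_rate_usd: float, tokens_per_sec: float) -> float:
    """Rental cost in USD to generate one million tokens at a steady throughput."""
    tokens_per_hour = tokens_per_sec * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000


def break_even_hourly_rate(base_rate_usd: float, base_tps: float, new_tps: float) -> float:
    """Hourly rate at which the faster GPU costs the same per token as the baseline."""
    return base_rate_usd * new_tps / base_tps


# Figures quoted in the thread; 12,000 is a rounded-up assumption ("nearly 12,000").
H100_RATE_USD_PER_HOUR = 5.0
H100_TOKENS_PER_SEC = 9_000
H200_TOKENS_PER_SEC = 12_000

h100_cost = cost_per_million_tokens(H100_RATE_USD_PER_HOUR, H100_TOKENS_PER_SEC)
h200_break_even = break_even_hourly_rate(
    H100_RATE_USD_PER_HOUR, H100_TOKENS_PER_SEC, H200_TOKENS_PER_SEC
)

print(f"H100: ${h100_cost:.3f} per million tokens")
print(f"H200 matches that per-token cost at ${h200_break_even:.2f}/h")
```

      With these inputs the H100 comes out at roughly $0.15 per million tokens, and an H200 would have to rent for under about $6.67/h to be cheaper per token.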
