meetrais@alien.topB to LocalLLaMA · English · 1 year ago

Tokens per Second

Hey All,

I have a few questions about how to calculate the tokens per second of an LLM.

To measure tokens per second for my fine-tuned models, I put a timer in my Python code and divide the number of output tokens by the elapsed time. So if the output is 20 tokens long and the model took 5 seconds, that is 20 / 5 = 4 tokens per second. Is this the correct method, or is there a better one?
If my model runs at 4 tokens per second on 8 GB of VRAM, will it run at 8 tokens per second on 16 GB of VRAM?
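
For reference, the timer approach described above could look roughly like the sketch below. This is only an illustration under assumptions: `fake_generate` is a hypothetical stand-in for whatever call actually produces the output tokens, and the helper names are made up for this example.

```python
import time


def tokens_per_second(num_tokens, seconds):
    """Plain throughput: output tokens divided by elapsed seconds."""
    return num_tokens / seconds


def measure_tokens_per_second(generate, prompt):
    """Time a single call to `generate` and return (tokens, seconds, tokens/s).

    `generate` is assumed to take a prompt and return a sequence of
    output tokens; swap in the real model call here.
    """
    start = time.perf_counter()          # monotonic, high-resolution timer
    output_tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    n = len(output_tokens)
    return n, elapsed, tokens_per_second(n, elapsed)


def fake_generate(prompt, num_tokens=20, delay_per_token=0.01):
    """Hypothetical stand-in for a model: emits one dummy token per delay."""
    tokens = []
    for i in range(num_tokens):
        time.sleep(delay_per_token)
        tokens.append(i)
    return tokens


if __name__ == "__main__":
    # The worked example from the post: 20 tokens in 5 seconds.
    print(tokens_per_second(20, 5))

    n, elapsed, tps = measure_tokens_per_second(fake_generate, "Hello")
    print(f"{n} tokens in {elapsed:.2f} s = {tps:.1f} tokens/s")
```

Note that a single timer around the whole call lumps prompt processing and token generation together, so the number depends on prompt length as well as output length.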

MINIMAN10001@alien.topB · 1 year ago
My understanding is that tokens per second typically splits into two categories: the prompt preprocessing time and the actual token generation time.

At least, that is what I remember from oobabooga.
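
A rough way to see that split yourself, assuming the model can stream tokens one at a time, is to record when the first token arrives. The sketch below is an approximation, not how oobabooga reports it: `fake_stream` is a hypothetical stand-in for a streaming model, and the time to first token includes generating that first token as well as processing the prompt.

```python
import time


def split_timing(token_stream):
    """Consume a token iterator and separate time-to-first-token from the rest.

    Time to first token is used as an approximation of prompt processing;
    the remaining time is treated as token generation.
    """
    start = time.perf_counter()
    first_token_time = None
    count = 0
    for _ in token_stream:
        if first_token_time is None:
            first_token_time = time.perf_counter()
        count += 1
    end = time.perf_counter()

    if first_token_time is None:
        raise ValueError("the stream produced no tokens")

    prompt_seconds = first_token_time - start
    generation_seconds = end - first_token_time
    # Only the tokens after the first one were produced in the generation window.
    if count > 1 and generation_seconds > 0:
        generation_tps = (count - 1) / generation_seconds
    else:
        generation_tps = float("nan")

    return {
        "tokens": count,
        "prompt_seconds": prompt_seconds,
        "generation_seconds": generation_seconds,
        "generation_tps": generation_tps,
    }


def fake_stream(num_tokens=10, prefill_delay=0.05, delay_per_token=0.01):
    """Hypothetical streaming model: one long pause, then steady tokens."""
    time.sleep(prefill_delay)            # simulated prompt processing
    for i in range(num_tokens):
        if i > 0:
            time.sleep(delay_per_token)  # simulated per-token generation
        yield i


if __name__ == "__main__":
    result = split_timing(fake_stream())
    print(result)
```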