I know that vLLM and TensorRT can be used to speed up LLM inference. I've been trying to find other tools that do similar things so I can compare them. Do you guys have any suggestions?

vLLM: speeds up inference (a minimal usage sketch is below)

TensorRT: speeds up inference

DeepSpeed: speeds up the training phase
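
For context, here is roughly what using one of these looks like: a minimal sketch of offline batch inference with vLLM's `LLM`/`SamplingParams` API. The model name `facebook/opt-125m` is just a small placeholder, not a recommendation.

```python
# Minimal vLLM offline-inference sketch.
# Assumptions: vLLM is installed and facebook/opt-125m is only a
# small placeholder model for illustration.
from vllm import LLM, SamplingParams

prompts = [
    "Explain paged attention in one sentence.",
    "Name one way to speed up LLM inference.",
]
sampling_params = SamplingParams(temperature=0.8, max_tokens=64)

# LLM() loads the model and sets up vLLM's KV-cache management;
# generate() runs the whole batch of prompts.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```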

  • Dogeboja@alien.top · 1 year ago

    I’m going insane with all of these options. Someone really needs to do a comparison of them.