I have a cluster of 4 A100 GPUs (4x80GB) and want to run meta-llama/Llama-2-70b-hf. I’m a beginner and need some guidance.
- Need a script to run the model.
- Is 4xA100 enough to run the model, or is it more than required?
- Need the model for inference only.
Apparently TensorRT-LLM is fairly good: https://twitter.com/abacaj/status/1722008290324807914
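For a rough sanity check on the memory question, here is my back-of-envelope math (my own assumptions: fp16/bf16 weights at 2 bytes per parameter, and `weights_fit` is just a hypothetical helper I wrote, not part of any library):

```python
# Rough memory check: do the raw model weights fit across the GPUs?
# Assumptions (mine, not authoritative): fp16/bf16 weights = 2 bytes/param;
# keep ~20% headroom free for KV cache and activations.

def weights_fit(num_params: float, bytes_per_param: int,
                num_gpus: int, gpu_mem_gb: int,
                headroom: float = 0.8) -> bool:
    """True if raw weights fit within `headroom` fraction of total GPU memory."""
    weights_gb = num_params * bytes_per_param / 1e9
    usable_gb = num_gpus * gpu_mem_gb * headroom
    return weights_gb <= usable_gb

# Llama-2-70B in fp16: ~70e9 params * 2 bytes ≈ 140 GB of weights.
# 4 x A100-80GB = 320 GB total (256 GB after headroom), so it should fit.
print(weights_fit(70e9, 2, 4, 80))  # → True
```

By this estimate 4x80GB is enough for fp16 inference with room left for the KV cache, while fp32 (4 bytes/param, ~280 GB) would be tight. If I understand correctly, Hugging Face `transformers` can shard the weights automatically with `AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-70b-hf", device_map="auto", torch_dtype=torch.float16)`, assuming `accelerate` is installed.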