Hardware for Meta Llama 2 65B for a Web App?

No-Activity-4824@alien.top · 2 years ago

Prudent-Artichoke-19@alien.top · 2 years ago

You'll need a load balancer of some sort, but an A6000 would be a good start. Expect roughly 15-20 tokens per second (tps) as a single user.

In vanilla form, Llama 2 may do silly stuff. Instruction-tuned variants, fine-tuning, etc. will decrease the likelihood.

If you are taking something to prod, I’d advise picking up a consultant to work with you.
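To make the A6000 suggestion a bit more concrete, here is a rough back-of-the-envelope sketch of whether the weights would fit in its 48 GB of VRAM at different precisions. The 65B parameter count is taken from the thread title, the function names are made up for illustration, and the 20% overhead for KV cache and activations is only an assumed placeholder; real usage depends on context length, batch size, and the inference runtime.

```python
# Rough VRAM estimate for model weights at different precisions.
# Assumptions (not measured): 20% overhead for KV cache/activations,
# and 1 GB = 1e9 bytes.

def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB."""
    # 1e9 params * (bits / 8) bytes each, divided by 1e9 bytes per GB.
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """True if weights plus the assumed overhead fit in the given VRAM."""
    needed = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


A6000_VRAM_GB = 48   # RTX A6000 memory size
PARAMS_BILLION = 65  # model size as stated in the thread title

for bits in (16, 8, 4):
    weights = weight_memory_gb(PARAMS_BILLION, bits)
    verdict = "fits" if fits(PARAMS_BILLION, bits, A6000_VRAM_GB) else "does not fit"
    print(f"{bits}-bit: ~{weights:.1f} GB of weights, {verdict} on one A6000")
```

Under these assumptions only a 4-bit quantized model fits on a single card, so a quantized build is what the A6000 suggestion implies.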

No-Activity-4824@alien.top · 2 years ago

Does it work well with other consumer graphics cards?
Is the 15-20 t/s output or input?
Regarding fine-tuning, Meta is working on it anyway, so hopefully there will be another release of the same models at the beginning of 2024, but fine-tuned.