I have a query that costs around 300 tokens, and since 1000 tokens cost $0.06, that translates to roughly $0.02 for that request.
Let's say I deployed a local LLaMA model on RunPod, on one of the cheaper machines. Would that request be cheaper than running it through GPT-4?
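A minimal sketch of that arithmetic (the $0.06/1K figure is GPT-4's quoted rate; the function name is just for illustration):

```python
# Per-request cost at a given per-1K-token price.
def cost_per_request(tokens: int, price_per_1k_usd: float) -> float:
    return tokens / 1000 * price_per_1k_usd

print(cost_per_request(300, 0.06))  # 0.018 -> roughly $0.02
```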
If you're looking at cloud / API services, the best option is probably TogetherAI or DeepInfra. TogetherAI tops out at $0.0009/1K tokens for 70B models, and DeepInfra tops out at $0.0007/1K input and $0.00095/1K output for 70B models. Both are well below Turbo and GPT-4 price levels. The big caveat is that this only works if the model you want is hosted there. If it isn't and you want to deploy the model yourself, RunPod is probably the "cheapest" option, but it bills for as long as the pod is active and will burn through money very quickly. In that case, RunPod likely won't be much cheaper, if at all, than using GPT-4.
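To make those rates concrete, here's a rough per-request comparison for the OP's ~300-token query. The prices are the ones quoted above and may be out of date, so treat this as a sketch, not a quote:

```python
# Per-request cost for a ~300-token request at the per-1K rates quoted above.
rates_per_1k_usd = {
    "GPT-4": 0.06,
    "TogetherAI 70B": 0.0009,
    "DeepInfra 70B (output)": 0.00095,
}

tokens = 300
for name, rate in rates_per_1k_usd.items():
    print(f"{name}: ${tokens / 1000 * rate:.6f}")
# GPT-4: $0.018000
# TogetherAI 70B: $0.000270
# DeepInfra 70B (output): $0.000285
```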
Depends entirely on which model you want. The Llama-2 13B serverless endpoint on RunPod would cost only about $0.001 for that request.
If you rent a cloud pod, it costs the same per hour no matter how much or little you send to it, so the effective per-request price depends entirely on how many requests you can keep it busy with.
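A rough break-even sketch of that trade-off (the $0.44/hr pod price here is a made-up example, not an actual RunPod quote):

```python
# How many requests per hour a rented pod must serve to beat GPT-4
# on the OP's ~$0.018 query. The hourly rate below is hypothetical.
pod_usd_per_hour = 0.44       # assumed pod price, not a real quote
gpt4_usd_per_request = 0.018  # ~300 tokens at $0.06/1K

break_even = pod_usd_per_hour / gpt4_usd_per_request
print(f"break-even: {break_even:.1f} requests/hour")  # ~24.4
```

Below that request rate, the idle pod is more expensive per request than just paying GPT-4's API price.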
Can't you use ChatGPT 3.5 for free? It would be the cheapest option and would surely beat any 70B model you can find on random websites.
Try HuggingFace Endpoints with one of the cheap T4-based serverless instances; those go to sleep after 15 minutes of inactivity as well.