I don't have the budget to host models on a dedicated GPU. What are the alternative options or platforms that let me use open-source models like Mistral, Llama, etc. on a pay-per-API-call basis?
Hugging Face has Inference Endpoints, which can be private or public as needed, with scale-to-zero (sleep when idle) built in.
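For illustration, here is a minimal sketch of a pay-per-call request using the `huggingface_hub` client; the model name and token are placeholders, and it assumes the model is reachable either through the serverless Inference API or a deployed Inference Endpoint you control.

```python
# Minimal sketch: pay-per-call inference via the huggingface_hub client.
# Assumes a valid HF token and that the model is served (serverless API
# or a deployed Inference Endpoint). Model name is illustrative.
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # or your endpoint URL
    token="hf_xxx",                              # placeholder token
)

# Single generation request; you pay per call/compute time rather than
# for an always-on GPU.
output = client.text_generation(
    "Explain what a pay-per-call inference API is in one sentence.",
    max_new_tokens=80,
)
print(output)
```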