I’m currently trying to figure out where it is cheapest to host and use these models.

I realized that many fine-tunes are not available on the common LLM API sites. For example, I want to use Nous Capybara 34B, but the only provider that offered it charged $20 per million tokens, which seems quite high considering that I see Llama 70B for around $0.70 per million tokens.

So are there any sites where I could host custom fine-tunes and get rates similar to the Llama 70B price mentioned above?

  • Kimononono@alien.topB
    1 year ago

    Would a service like RunPod work for you? It sells GPU time by the hour instead of charging per token.
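
To compare hourly GPU rental against per-token pricing, it can help to convert the hourly rate into an effective price per million tokens. Below is a rough sketch of that arithmetic in Python. The function names are made up for illustration, and the hourly rate and throughput in the example are placeholder assumptions, not real RunPod prices or measured speeds for any model. It also assumes the GPU is busy the whole time it is rented; idle time makes the effective price higher.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Effective USD per million tokens when paying for a GPU by the hour.

    Assumes the GPU generates tokens continuously at the given rate.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def break_even_tokens_per_second(gpu_hourly_usd: float, api_usd_per_million: float) -> float:
    """Sustained throughput needed for hourly rental to match a per-token price."""
    if api_usd_per_million <= 0:
        raise ValueError("api_usd_per_million must be positive")
    return gpu_hourly_usd * 1_000_000 / (api_usd_per_million * 3600)


if __name__ == "__main__":
    # Placeholder assumptions: replace with a real quote and a measured throughput.
    hourly_rate = 1.00   # USD per GPU-hour (hypothetical)
    throughput = 20.0    # tokens per second (hypothetical)

    effective = cost_per_million_tokens(hourly_rate, throughput)
    print(f"Effective price: ${effective:.2f} per million tokens")

    # The two per-token prices quoted in the original post.
    for api_price in (20.0, 0.70):
        needed = break_even_tokens_per_second(hourly_rate, api_price)
        print(f"Break-even vs ${api_price:.2f}/M tokens: {needed:.1f} tokens/s sustained")
```

The takeaway is that hourly rental only beats a per-token API if you can keep the GPU busy at a high enough sustained throughput, so batching many requests matters a lot.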