I use C#. Initially I’d gone all out trying to wrap llama.cpp myself, but my wrapper was getting outdated in a matter of weeks and it was going to take a ton of effort to keep up.
So instead I run a local ooba server and use the API. I get to do all my business logic in nice, structured C#, while all the Python stuff stays in ooba and I don’t really have to dig into it at all.
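For anyone wanting to try the same setup, here's a rough sketch of what the C# side could look like. It assumes ooba was started with its API enabled and that it exposes an OpenAI-style completions endpoint at `http://127.0.0.1:5000/v1/completions`. The URL, the `prompt` / `max_tokens` fields, and the `choices[0].text` response shape are assumptions about your install, and `OobaClient` is just a made-up name, so check them against whatever version you're running.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper: a thin client for a locally running ooba server.
public static class OobaClient
{
    // Assumption: ooba's API is enabled and serves an OpenAI-style
    // completions endpoint here. Adjust host, port, and path to your setup.
    public const string Endpoint = "http://127.0.0.1:5000/v1/completions";

    private static readonly HttpClient Http = new HttpClient();

    // Builds the JSON request body (field names assume the OpenAI-style schema).
    public static string BuildRequestJson(string prompt, int maxTokens) =>
        JsonSerializer.Serialize(new { prompt, max_tokens = maxTokens });

    // Extracts the generated text from a response shaped like
    // {"choices":[{"text":"..."}]}.
    public static string ExtractText(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        return doc.RootElement
                  .GetProperty("choices")[0]
                  .GetProperty("text")
                  .GetString() ?? "";
    }

    // Sends the prompt to the local server and returns the completion text.
    public static async Task<string> CompleteAsync(string prompt, int maxTokens = 200)
    {
        using var content = new StringContent(
            BuildRequestJson(prompt, maxTokens), Encoding.UTF8, "application/json");
        using HttpResponseMessage response = await Http.PostAsync(Endpoint, content);
        response.EnsureSuccessStatusCode();
        return ExtractText(await response.Content.ReadAsStringAsync());
    }
}
```

From the rest of your code it's then just `string reply = await OobaClient.CompleteAsync("Hello there");`, and nothing on the C# side has to know what model or backend ooba is actually running.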
I’m not doing that, but my guess is it’s fun, easy, and cheap to do (only $8/mo!) and potentially lucrative if you can cheese a following somehow.
Using GPT is really lazy, though, when it’s so easy to train a custom 13B LoRA that will actually interact like a human.