I’m fascinated by the whole ecosystem popping up around llama and local LLMs. I’m also curious what everyone here is up to with the models they are running.
Why are you interested in running local models? What are you doing with them?
Secondarily, how are you running your models? Are you truly running them on a local hardware or on a cloud service?
I’m slowly working on a change to Home Assistant (https://www.home-assistant.io/) to take the OpenAI conversation addon that they have and make it support connecting to any base url. Along with that I’m going to make some more addons for other inference servers (particularly koboldcpp, exllamav2, and text-gen-webui) so that with all their new voice work this year I can plug things in and have a conversation with my smart home and other data that I provide it.