My use case is that I want to create a service based on Mistral 7B that will serve an internal office of 8-10 users.
I’ve been looking at modal.com and RunPod. Are there any other recommendations?
WebAssembly based open source LLM inference (API service and local hosting): https://github.com/second-state/llama-utils
Hmm, cool. It seems the inference app itself is only a few MB.
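For a small office setup like this, most of the options mentioned (llama-utils' API server, and typical Modal/RunPod deployments) can expose an OpenAI-compatible `/v1/chat/completions` endpoint, so client code stays the same regardless of which host you pick. Here is a minimal sketch of an internal client, assuming a hypothetical local endpoint URL and model name; adjust both to match your actual deployment.

```python
import json
import urllib.request

# Hypothetical endpoint; point this at wherever your Mistral 7B
# service is actually running (localhost, Modal, RunPod, etc.).
API_URL = "http://localhost:8080/v1/chat/completions"

def build_payload(prompt: str, model: str = "mistral-7b-instruct") -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str) -> str:
    """POST the prompt and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Summarize the pros and cons of self-hosting Mistral 7B."))
```

Keeping to the OpenAI-compatible request shape means you can swap hosting providers later without touching any client code.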