Key-Comparison3261@alien.top (OP) to LocalLLaMA · 1 year ago · Any interest in C#/.NET for serving LLMs?
You have exllama, vLLM, and lmdeploy in Python, and in most cases FastAPI is used to serve an HTTP endpoint on top of them.
I wrote llm-sharp precisely to drop Python (the GIL, pip dependency management) and to get flexible adaptation to dynamic model structures beyond the standard Llama architecture.
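For anyone wondering what the serving side looks like in .NET, here's a minimal sketch of an OpenAI-style completion endpoint using ASP.NET Core minimal APIs (Microsoft.NET.Sdk.Web project). The `ITextGenerator`, `StubGenerator`, and request/response types are placeholders I made up for illustration, not llm-sharp's actual API; the real thing would wrap a loaded model behind that interface.

```csharp
// Minimal ASP.NET Core sketch of an OpenAI-style completion endpoint.
// ITextGenerator / StubGenerator are hypothetical stand-ins, not llm-sharp's API.

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddSingleton<ITextGenerator, StubGenerator>();
var app = builder.Build();

// Bind the JSON body to CompletionRequest and resolve the generator from DI.
app.MapPost("/v1/completions", (CompletionRequest req, ITextGenerator gen) =>
{
    var text = gen.Generate(req.Prompt, req.MaxTokens);
    return Results.Ok(new CompletionResponse(text));
});

app.Run();

// Request/response contracts (deliberately tiny for the sketch).
record CompletionRequest(string Prompt, int MaxTokens = 128);
record CompletionResponse(string Text);

// Abstraction over whatever backend actually runs the model.
interface ITextGenerator
{
    string Generate(string prompt, int maxTokens);
}

// Stand-in implementation so the sketch runs without loading a model.
class StubGenerator : ITextGenerator
{
    public string Generate(string prompt, int maxTokens) =>
        $"[echo of '{prompt}', capped at {maxTokens} tokens]";
}
```

The nice part compared to the Python stack is that the web server, DI, and the model code all live in one process and one toolchain, with no GIL to work around when handling concurrent requests.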