Hi community,
I am writing my own GUI in which I want to use an LLM completely locally. The problem is I don't know how to start with the LLM.
Can someone explain to me how the first steps of integrating/working with the LLM work, or does someone know some good tutorials?
The LLM is downloaded locally. Now I need to integrate a library or something? Sorry, I could not find much useful/direct information on the topic.
Thank you very much in advance!
There aren't many tutorials, and there is already a fully local LLM GUI: oobabooga.
Thanks for the comments so far!
My intention was to learn about LLMs and put that into practice. I am aware of existing projects and already have something working; it just feels a bit overloaded and impractical.
I wanted to learn about writing an application using LLMs. My goal was to write something that could load a set of PDFs and let me ask questions about them, all on my PC.
I used PySide6 (Qt for Python) for the GUI since I was pretty familiar with Qt and C++, and I'm not a fan of running apps in a browser.
I used a combination of HuggingFace APIs and LangChain to write the application itself, and ended up with a somewhat generalized application into which I could load different models.
It works, maybe not perfectly, but I did accomplish what I wanted: to learn about implementing an application.
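In case it helps anyone, the core of the PDF side looked roughly like this. This is a simplified sketch, not my exact code: the imports are the older langchain 0.0.x-style paths (they have since moved), and the PDF name and model path are placeholders.

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA
from transformers import pipeline

# Load the PDF and split it into chunks small enough for the context window
docs = PyPDFLoader("manual.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# Embed the chunks and index them for similarity search
db = FAISS.from_documents(chunks, HuggingFaceEmbeddings())

# Wrap a locally downloaded HF model as the LLM (path is a placeholder)
pipe = pipeline("text-generation",
                model="path/to/local-model", max_new_tokens=256)
llm = HuggingFacePipeline(pipeline=pipe)

# Retrieval + generation: fetch the most relevant chunks and answer from them
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())
print(qa.run("What does the manual say about installation?"))
```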
I did the same thing with Stable Diffusion models, but with just the HuggingFace APIs, no LangChain.
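The Stable Diffusion part is even shorter with the diffusers library. A rough sketch (the model ID and prompt are just examples):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a Stable Diffusion checkpoint (downloaded once, then cached locally)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Generate an image from a text prompt and save it to disk
image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```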
If you want to learn how to run an LLM in inference mode locally, code that at the command line first; don't waste time building a GUI app. Build a GUI app if you want to learn how to build GUI apps, or once you've already proven out the engine at the command line and now need a pretty GUI around it.
If you have a GPU, then I'd suggest setting up a TGI (Text Generation Inference) container with the correct model. If no GPU is available, then use the server.cpp example in the llama.cpp repository and simply invoke it from your GUI.
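With the llama.cpp route, your GUI just talks HTTP to the server. A minimal sketch of the client side, assuming the server example is already running (e.g. started with ./server -m ./models/model.gguf --port 8080; check the repo for the current flags and endpoint details):

```python
import requests

def complete(prompt: str) -> str:
    # POST to the server's /completion endpoint and return the generated text
    resp = requests.post(
        "http://127.0.0.1:8080/completion",
        json={"prompt": prompt, "n_predict": 128},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["content"]

print(complete("Q: What is a GGUF file?\nA:"))
```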
Text generation web UI now has an OpenAI-compatible API that you can easily access from your own GUI. llama.cpp also has a server component, just like koboldcpp. For an easy start, I would recommend oobabooga (Text generation web UI). It does provide its own GUI, but you don't have to use it (it can now be switched off with the --nowebui argument). Their API is really great. The advantage is that you don't have to worry about things like different formats, chat templates, etc. yourself. It is also compatible with many backends, such as llama.cpp, exllama, exllamav2, etc. Otherwise, koboldcpp is also a simple backend that is easy to set up.
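To illustrate, with the OpenAI-compatible API enabled your GUI only needs a few lines. A sketch, assuming the default API port 5000; adjust to your setup:

```python
from openai import OpenAI

# Point the standard OpenAI client at the local server instead of openai.com
client = OpenAI(base_url="http://127.0.0.1:5000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="local",  # the locally loaded model is typically used regardless
    messages=[{"role": "user", "content": "Hello from my own GUI!"}],
)
print(resp.choices[0].message.content)
```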
Don't start by writing an entire GUI. First write a simple piece of code that loads a model and does inference; there are probably 10 lines in total. You can just grab the code people post with their models (TheBloke always posts a code snippet).
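Something along the lines of those snippets (a sketch; the model path is a placeholder for whatever you downloaded):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/your/downloaded-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

# Tokenize a prompt, generate a continuation, and decode it back to text
prompt = "Explain in one sentence what an LLM is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```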
Now create the GUI: instead of preset text, add a text box and a "Go" button so it can run inference on your text.
Boom, a GUI.
Now go from there and keep adding: a dropdown list to select the model, a dropdown list to select the instruction template, etc.
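A minimal PySide6 version of that text-box-plus-Go step could look like this (generate_reply is a stand-in for the inference code above):

```python
import sys
from PySide6.QtWidgets import (QApplication, QPlainTextEdit,
                               QPushButton, QVBoxLayout, QWidget)

def generate_reply(prompt: str) -> str:
    # Stand-in: call your model loading/inference code here
    return f"(model output for: {prompt!r})"

app = QApplication(sys.argv)
window = QWidget()
layout = QVBoxLayout(window)

prompt_box = QPlainTextEdit()   # where the user types
output_box = QPlainTextEdit()   # where the answer appears
output_box.setReadOnly(True)
go_button = QPushButton("Go")
go_button.clicked.connect(
    lambda: output_box.setPlainText(generate_reply(prompt_box.toPlainText()))
)

for widget in (prompt_box, go_button, output_box):
    layout.addWidget(widget)

window.show()
sys.exit(app.exec())
```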
I did the same thing. Having read dozens of articles, watched even more YouTube vids, and leeched info from this sub, I created a simple chat UI using llama, and I've also created a simple RAG setup. Of course you can run out-of-the-box UIs, like oobabooga's text-gen-webui or even the HuggingFace one, but I wanted to learn Python and AI at the same time :) DM me if you want to discuss stuff