Hi folks, I have edited the llama.cpp server frontend and made it look nicer. Also added a few functions. Something I have been missing there for a long time: Templates for Prompt Formats.
Here to the github link: ++camalL
Otherwise here is a small summary:
- UI with CSS to make it look nicer and cleaner overall.
- CSS outsourced as a separate file
- Added a dropdown menu with prompt style templates
- Added a dropdown menu with system prompts
- Prompt Styles and System Prompts are separate files, so editing is very easy.
- Created a script that uses “dialog” to compose the command for the server.
- Script offers the possibility to save and load configs
In planning or already started:
- WIP Multilingual: You will be able to select the language from a dropdown menu. So far language files only for English and German. (concerns UI elements and system prompts).
- Dark Mode
- Templates for the values of the UI options (samplers etc.), e.g. deterministic template, creative template, balanced template etc…
- Zenity start script (like dialog, but gui)
-–
As for the prompt format templates, I just picked a few by feel. The most important are the four to which almost all others can be traced back: Alpaca, ChatML, Llama2, Vicuna.
But if you want more templates for a specific model, feel free to let me know here or on github.
As you can see on the third picture, it should now be easier for beginners to use the llama.cpp server, since a tui dialog will assist them.
Hope you like my work. Feel free to give feedback
ps: I’ve made a pull request, but for now I publish it on my own forked repo.
Maybe wrong suggestion, but I got used to have /docs endpoint with description of the endpoints available, would you consider adding it too u/Evening_Ad6637?
It could point to/render https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md#api-endpoints at first, anyway seems helpful to have it served.Ah one sidenote: selecting a model via dialog is absolutely not intuitive. If you want to navigate into a folder, you have to press space two times. Do not press enter until you decide to choose a specific folder. It doesnt matter that much if you are in parent folders, since the script will search recursively - but of course if you have many files it could take a long time.
much needed change! thanks. hope they merge the change quick.
I dont know what the repo maintainers have in mind, but will they allow it to be changed to look like the oai playground, like the controls on the right and center page is completely chat etc?
Yes the openai playground was my styling inspiration. I thought this is good since a lot of users will used to it.
the llama.cpp dev (gerganov) already answered and accepts a merge : ))
the values here seem off, not normalized, but I like the idea.
Does it have min-p sampling?
u/ambient_temp_xeno ah I have now seen that min-p has been implemented in the server anyway, so I have now added it too.
Regarding that “prediction” setting, what exactly is it? I remember n_predict from using llama.cpp directly but I think i always set it to -1 for like max. And I think I don’t even have such a setting in llama-cpp-python?
Yes it means predict n tokens. Is it not easy to understand? I might change it back… For me it is important that an ui is also not overbloated with “words” and unfortunately “Predict_n Tokens”… how can I say… it ‘looks’ aweful. So I am looking for something more aesthetic but also easy to understand. It’s difficult for me to find.
Oh it wasnt about your choice of words, that seems fine.
not sure if usable, but “rounds” or “amount” seem good alternatives.
I am not able to select and copy any text while generating. Seems like a UX bug where the selection disappears with each token streamed in.
thanks for your feedback. that’s strange, I couldn’t reproduce this bug (or I didn’t understand the error?)
I’ll answer you on github more detailed.
how do i run it? When I build it and run ./server it just shows the default ui
did you cloned it from my repo?
It definitely should have a live token counter as you type the prompt in.
Do you plan to make this a PR against llama.cpp? It really deserves to be merged in.
That’s a pretty good idea! thanks for your input. I will definitely make a note of it as an issue in my repo and see what I can do.
Thank you for saying that. It makes me feel valued for my work. I’ve already made a pull request and Gerganov seems to like the work in general, so he would accept a merge. I still need to fix a few things here and there though - the requirements at the llama.cpp dudes are very high : D (but i don’t expect anything else there heheh)
Do you happen to have on hand the docs for the api?