I have given llama.cpp server ui a facelift

Evening_Ad6637@alien.top · 3 years ago

I have given llama.cpp server ui a facelift

uhuge@alien.top · 3 years ago

Maybe a wrong suggestion, but I've gotten used to having a /docs endpoint that describes the available endpoints. Would you consider adding one too, u/Evening_Ad6637?
It could point to or render https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md#api-endpoints at first; either way, it seems helpful to have it served.

Evening_Ad6637@alien.top · 3 years ago

Ah, one side note: selecting a model via the dialog is absolutely not intuitive. To navigate into a folder, you have to press space twice. Do not press enter until you have decided on a specific folder. It doesn't matter much if you stay in a parent folder, since the script searches recursively, but of course if you have many files it could take a long time.

ab2377@alien.top · 3 years ago

Much-needed change! Thanks. Hope they merge it quickly.

I don't know what the repo maintainers have in mind, but will they allow it to be changed to look like the OAI playground, with the controls on the right and the center of the page devoted entirely to chat, etc.?

Evening_Ad6637@alien.top · 3 years ago

Yes, the OpenAI playground was my styling inspiration. I thought this was a good choice since a lot of users will be used to it.

The llama.cpp dev (gerganov) has already answered and will accept a merge : ))

uhuge@alien.top · 3 years ago

the values here seem off, not normalized, but I like the idea.

https://preview.redd.it/rmmx9kclh33c1.png?width=183&format=png&auto=webp&s=03a94c6b8d21dee12ab9822f4206752296e08172

ambient_temp_xeno@alien.top · 3 years ago

Does it have min-p sampling?

Evening_Ad6637@alien.top · 3 years ago

u/ambient_temp_xeno ah, I have now seen that min-p has already been implemented in the server, so I have added it to the UI too.
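
For readers unfamiliar with min-p, here is a minimal sketch of the general idea in plain Python: a token is kept only if its probability is at least `min_p` times the probability of the most likely token, and the survivors are renormalized. This is not llama.cpp's actual code; the function name `min_p_filter` and the default value are illustrative assumptions.

```python
import math


def min_p_filter(logits, min_p=0.05):
    """Return {token_index: renormalized_prob} for tokens that survive min-p.

    Hypothetical helper for illustration only, not part of llama.cpp.
    """
    # Softmax with max-subtraction for numerical stability.
    top_logit = max(logits)
    exps = [math.exp(x - top_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The cutoff scales with the top token's probability.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}
```

Because the cutoff is relative to the top token, a confident distribution prunes aggressively while a flat one keeps many candidates.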

involviert@alien.top · 3 years ago

Regarding that “prediction” setting, what exactly is it? I remember n_predict from using llama.cpp directly, but I think I always set it to -1 for, like, max. And I think I don’t even have such a setting in llama-cpp-python?

Evening_Ad6637@alien.top · 3 years ago

Yes, it means predict n tokens. Is it not easy to understand? I might change it back… For me it is important that a UI is not bloated with “words”, and unfortunately “Predict_n Tokens”… how can I say… it ‘looks’ awful. So I am looking for something more aesthetic but also easy to understand. It’s difficult for me to find.
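
To make the setting concrete, here is a hedged sketch of how `n_predict` could be passed to the server's `/completion` endpoint from Python. The helper name `build_completion_request` is hypothetical, and the base URL assumes a server running locally on port 8080; adjust it to your setup. A value of -1 means no fixed limit on the number of generated tokens.

```python
import json
import urllib.request


def build_completion_request(prompt, n_predict=-1,
                             base_url="http://127.0.0.1:8080"):
    """Build a POST request for the llama.cpp server's /completion endpoint.

    Hypothetical helper; base_url is an assumption about where the
    server is listening.
    """
    payload = {"prompt": prompt, "n_predict": n_predict}
    return urllib.request.Request(
        f"{base_url}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (needs a running server, so it is left commented out):
# req = build_completion_request("Hello, my name is", n_predict=64)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["content"])
```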

involviert@alien.top · 3 years ago

Oh, it wasn't about your choice of words, that seems fine.

uhuge@alien.top · 3 years ago

not sure if usable, but “rounds” or “amount” seem good alternatives.

uhuge@alien.top · 3 years ago

I am not able to select and copy any text while generating. Seems like a UX bug where the selection disappears with each token streamed in.

Evening_Ad6637@alien.top · 3 years ago

Thanks for your feedback. That’s strange, I couldn’t reproduce this bug (or maybe I didn’t understand the error?).
I’ll answer you in more detail on GitHub.

RayIsLazy@alien.top · 3 years ago

How do I run it? When I build it and run ./server, it just shows the default UI.

Evening_Ad6637@alien.top · 3 years ago

Did you clone it from my repo?

arthurwolf@alien.top · 3 years ago

It definitely should have a live token counter as you type the prompt in.

Do you plan to make this a PR against llama.cpp? It really deserves to be merged in.
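
One way a live token counter could work is to send the current prompt text to the server's `/tokenize` endpoint and count the returned tokens. The sketch below is in Python for illustration only (the real UI is browser code); `server_tokenize`, `TokenCounter`, and the base URL are hypothetical names and assumptions, not part of the project.

```python
import json
import urllib.request


def server_tokenize(text, base_url="http://127.0.0.1:8080"):
    """Ask a llama.cpp server to tokenize text; base_url is an assumption."""
    req = urllib.request.Request(
        f"{base_url}/tokenize",
        data=json.dumps({"content": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["tokens"]


class TokenCounter:
    """Caches the last count so unchanged text triggers no new request."""

    def __init__(self, tokenize=server_tokenize):
        self._tokenize = tokenize
        self._last_text = None
        self._last_count = 0

    def update(self, text):
        # Only re-tokenize when the prompt text actually changed.
        if text != self._last_text:
            self._last_count = len(self._tokenize(text))
            self._last_text = text
        return self._last_count
```

In a real UI, `update` would be called from a debounced input handler so that fast typing does not flood the server with requests.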

Evening_Ad6637@alien.top · 3 years ago

That’s a pretty good idea! thanks for your input. I will definitely make a note of it as an issue in my repo and see what I can do.

Thank you for saying that. It makes me feel valued for my work. I’ve already made a pull request and Gerganov seems to like the work in general, so he would accept a merge. I still need to fix a few things here and there though - the llama.cpp dudes' requirements are very high : D (but I wouldn’t expect anything else there, heheh)

aseichter2007@alien.top · 3 years ago

Do you happen to have the API docs on hand?