I've switched from llama-cpp-python[server] to the llama.cpp server, and it's working great with OAI API calls, except for multimodal, which wasn't working. Patched it with one line and voilà, works like a charm!
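In case it helps anyone making the same switch, here's roughly how the OAI-style calls look against the llama.cpp server. This is only a minimal sketch: the host/port, model name, and api_key value are placeholders, and it assumes the server binary was started locally with its OpenAI-compatible endpoint available.

```python
# Minimal sketch, assuming a llama.cpp server is already running locally, e.g.:
#   ./server -m models/your-model.gguf --host 127.0.0.1 --port 8080
# Host/port, model name, and api_key below are placeholders, not fixed values.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8080/v1",  # llama.cpp's OpenAI-compatible endpoint
    api_key="sk-no-key-required",         # ignored unless the server enforces an API key
)

response = client.chat.completions.create(
    model="local-model",  # the server answers with whatever model it was started with
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what llama.cpp is in one sentence."},
    ],
)
print(response.choices[0].message.content)
```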
No, nothing I am working on or will be working on will end up anywhere uncontrolled. Period. Besides, it'll get banned immediately anyway, so why bother lol
Misread the title as "… thanks to F FFS" 😂 Joke aside, we should be seeing a 10-20x speed-up for current gen AI, including text, sound, and images. That would be a pivotal moment for gen AI.
From my experience, give it a lot, and I mean really a lot, of consistent examples in the prompt of how you want it to respond in different kinds of situations, including the tone and style. It works. Aim for at least 10 to 20 examples per scenario; consistency is the key. Rough sketch below.
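Something like this, as a minimal sketch of the idea (the scenario names and example texts are just placeholders; the point is many consistent demonstrations per scenario, packed into the prompt as user/assistant pairs):

```python
# Sketch of few-shot prompting: show the model many consistent examples of the
# desired tone/style for each scenario. All names and texts here are placeholders.
FEW_SHOT_EXAMPLES = {
    "angry_customer": [
        ("The product broke after one day!",
         "I'm really sorry to hear that. Let's get this fixed right away..."),  # calm, apologetic tone
        # ...aim for 10-20 consistent pairs per scenario
    ],
    "feature_request": [
        ("Can you add dark mode?",
         "Thanks for the suggestion! I've passed it along to the team..."),  # upbeat, appreciative tone
        # ...aim for 10-20 consistent pairs per scenario
    ],
}

def build_messages(system_prompt: str, user_input: str) -> list[dict]:
    """Assemble a chat prompt: system instructions, then every few-shot pair, then the real input."""
    messages = [{"role": "system", "content": system_prompt}]
    for pairs in FEW_SHOT_EXAMPLES.values():
        for example_input, example_reply in pairs:
            messages.append({"role": "user", "content": example_input})
            messages.append({"role": "assistant", "content": example_reply})
    messages.append({"role": "user", "content": user_input})
    return messages
```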
Already raised an issue; couldn't create a PR as I'm on my phone only. The solution is also included:
https://github.com/ggerganov/llama.cpp/issues/4245