I’ve tried applying a lot of prompting techniques to 7B and 13B models, and no matter how hard I tried, there was barely any improvement.
I’ve had success with 7B Llama 2 across multiple prompt scenarios. Make sure you’re defining the objective clearly.
At first, after reading your post, I thought you were talking about something even smaller (phi-1 / TinyLlama).
what models did you try?
If anyone has managed to consistently stop OpenHermes 2.5 from ranting three paragraphs of fluff instead of answering in a single sentence with the same content, let me know.
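Not a guaranteed fix, but here’s a minimal sketch of the blunt-force approach, assuming llama-cpp-python and a local OpenHermes 2.5 GGUF (the model path is a placeholder): the model speaks ChatML, so combine a terse system prompt with a hard token cap and stop sequences so it physically can’t ramble.

```python
# Sketch only: assumes llama-cpp-python and a local OpenHermes 2.5 GGUF.
# The model path is a placeholder. OpenHermes 2.5 uses the ChatML format.
from llama_cpp import Llama

llm = Llama(model_path="openhermes-2.5-mistral-7b.Q4_K_M.gguf", n_ctx=2048)

prompt = (
    "<|im_start|>system\n"
    "Answer in exactly one sentence. No preamble, no elaboration.<|im_end|>\n"
    "<|im_start|>user\n"
    "What causes rust on iron?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Belt and suspenders: even if the model ignores the system prompt,
# max_tokens and the stop sequences cut the rant short.
out = llm(prompt, max_tokens=60, stop=["<|im_end|>", "\n\n"])
print(out["choices"][0]["text"].strip())
```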
What you’re referring to as “prompt engineering” is more accurately described as how to get good interpolations between ChatGPT behaviors. Those are specific instructions and behaviors that OpenAI trains its models on, in proportions carefully chosen to reach good generalization on them.
And it’s not that the models are too small: Mistral 13B will be better than gpt-3.5-turbo. It’s all about the training.
Anyway, that’s why I try to loudly proclaim the benefits of few-shot examples and completion instead of instruction, at least until we have models trained the way OpenAI’s are. If you’re willing to write examples and dodge the chatbot-trained behaviors, you can perform pretty much any task without any need for training.
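To make that concrete, here’s a minimal sketch of the completion-style few-shot pattern (the sentiment task and the llama-cpp-python setup are illustrative, not from the thread): no instruction anywhere, just a pattern for the base model to continue.

```python
# Few-shot completion sketch: no instruction, no chat template.
# A base model simply continues the pattern. Model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b.Q4_K_M.gguf", n_ctx=2048)

prompt = """Review: The battery died within a week.
Sentiment: negative

Review: Setup took two minutes and it just works.
Sentiment: positive

Review: Arrived late but the quality is decent.
Sentiment:"""

# Stop at the newline so we get a single label, not a continuation
# of the made-up review list.
out = llm(prompt, max_tokens=5, stop=["\n"])
print(out["choices"][0]["text"].strip())
```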
Have you tried a combination of Mistral-Instruct and LangChain? If not, can you share some sample inputs you’re having problems with?
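For anyone going down that road, a tiny sketch of the raw Mistral-Instruct template is worth having; LangChain’s wrappers change across versions, but whatever layer you use ultimately has to emit this shape, and getting it wrong is a common cause of ignored instructions.

```python
# Raw Mistral-Instruct prompt format. Any toolkit (LangChain included)
# ultimately has to produce this shape for the model to behave.
def mistral_instruct(user_message: str) -> str:
    return f"<s>[INST] {user_message} [/INST]"

print(mistral_instruct("Summarize the article below in one sentence."))
# -> <s>[INST] Summarize the article below in one sentence. [/INST]
```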
It’s definitely not useless, it just doesn’t understand instructions as literally as big models do. Instead of asking it to write eloquently, try delivering the instructions in the register of what you want; it conveys the meaning far better.
Bad: “Write like you’re in a wild west adventure.”
Good: “Yee-haw, partner! Saddle up for a rip-roarin’, gunslingin’ escapade through the untamed heart of the Wild West!”
You also can’t force it to do whatever you want; it depends on the training data. If I want it to output a short story, requesting a scenario versus a book summary will sometimes give wildly different results.
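One way to push the “Good” version above even further (a sketch, assuming a raw completion endpoint via llama-cpp-python; the path is a placeholder): don’t instruct at all, just seed the output with in-style text and let the model keep going.

```python
# Style-priming sketch: instead of describing the style, start the
# completion *in* the style and let the model carry it forward.
# Model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b.Q4_K_M.gguf", n_ctx=2048)

primer = ("Yee-haw, partner! Saddle up for a rip-roarin', gunslingin' "
          "escapade through the untamed heart of the Wild West! ")

out = llm(primer, max_tokens=120)
print(primer + out["choices"][0]["text"])
```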
I think prompt engineering is entirely for smaller models.
Try this for a prompt and go from there: “Describe yourself in detail.”
Every model will react differently to the same prompt. Smaller models might get confused by complicated prompts designed for GPT-4.
Models coming from Mistral, and small models fine-tuned on QA or instructions, need specific instructions in question format. For example, the prompt “Extract the name of the actor mentioned in the article below” may not give the expected results. Change it to “What’s the name of the actor mentioned in the article below?” and you’ll get better results. So yes, prompt engineering is important in small models.
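This is cheap to verify on your own setup; here’s a hedged sketch that A/B tests the two phrasings (model path and article text are placeholders):

```python
# A/B sketch: same extraction task, imperative vs. question phrasing.
# Model path and article text are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf", n_ctx=2048)

article = "Last night, Tom Hanks surprised fans at a small theater in Austin."

for phrasing in (
    "Extract the name of the actor mentioned in the article below.",
    "What's the name of the actor mentioned in the article below?",
):
    prompt = f"<s>[INST] {phrasing}\n\n{article} [/INST]"
    out = llm(prompt, max_tokens=20, stop=["\n"])
    print(f"{phrasing!r} -> {out['choices'][0]['text'].strip()!r}")
```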
I wouldn’t really consider rephrasing a question to be prompt engineering. But yes, the way the model was trained dictates how you should ask it questions, and if you don’t follow the proper format, you’re less likely to get the response you want.
well, I believe that at its core it’s the process of guiding a generative AI solution toward the output you want. So iterating over rephrasings, prompt versioning, and of course using the proper format are all essential. I’m testing some new software architectures using three instances of Mistral with different tasks, using the output from one as input for the next, and boy, Mistral is amazing.
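A toy sketch of that output-as-input chaining (one shared model instance here for simplicity, where the commenter runs three; the prompts, path, and task are all illustrative):

```python
# Pipeline sketch: three sequential Mistral-Instruct calls, each
# consuming the previous stage's output. Everything here is illustrative.
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf", n_ctx=4096)

def ask(task: str) -> str:
    out = llm(f"<s>[INST] {task} [/INST]", max_tokens=256)
    return out["choices"][0]["text"].strip()

# Stage 1 output feeds stage 2, stage 2 feeds stage 3.
outline = ask("Write a three-point outline on why small LLMs need tight prompts.")
draft = ask(f"Expand this outline into a short paragraph:\n{outline}")
final = ask(f"Tighten this paragraph to under 60 words:\n{draft}")
print(final)
```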
I’ll be honest, this question and the answers here are a classic example of LLM prompting: what would be really useful is some examples of what you tried and what challenges you faced, so people can give more informed and targeted advice.
Most of the time the issue is with the prompt template, especially the spaces: “###instruction” vs. “### instruction”, etc.
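For reference, here’s the classic Alpaca-style template where that exact spacing bites (a sketch; whether your particular fine-tune expects a space after “###” depends on its training data, so check the model card):

```python
# Alpaca-style template sketch. Fine-tunes are sensitive to the exact
# bytes they were trained on: "### Instruction:" with a space is the
# canonical form, and "###Instruction:" can silently degrade output.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

print(ALPACA_TEMPLATE.format(instruction="List three uses of vinegar."))
```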
Smaller models need a good prompt. I tried the newer Mistral 2.5 7B and prompts work superbly on it.
“Prompt engineering” lmao
I don’t find it useless at all. It’s just more mysterious, because sometimes one line of prompt gets you the best results, and sometimes it will actually listen to much more detailed instructions. But it’s a bit of a crapshoot and trial and error.
At present, the simpler and narrower the scope of the instruction, the better. These models cannot understand complex tasks. Work out how the model thinks in general, then focus on one thing in particular and arrange your prompt accordingly.
It’s a skill issue
Generally it’s the opposite: the bigger the model, the less prompt engineering is required to get a satisfying output.