Hi. I am fairly new to NLP and ML as a whole, and I am looking to create a text classification model. I have tried it with DeBERTa and the results are decent (about 70% accuracy), but I need more accuracy. Are generative models a better alternative, or should I stick to smaller models like BERT, or maybe even non-NN classifiers, and work on better dataset quality?
Maybe it’s overkill, idk, but if you want higher accuracy, it’s an option
You can just list examples from your dataset and let the LLM complete the last one
Example:
I don’t know about the task you have in mind specifically, but you can do just about anything with a 13B Llama model. Which fine-tune you pick doesn’t matter much if you use examples instead of instructions. 7B Mistral seems to do fine with this example (even GPT-2 can do some classification), but in-context learning is remarkably better at 13B, picking up a lot more nuance.
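For anyone who wants to try the "list examples and let the LLM complete the last one" idea, here is a rough sketch of how such a prompt could be assembled. The `Text:`/`Label:` layout, the function name `build_few_shot_prompt`, and the sample sentences are all made up for illustration; they are not from this thread or from any particular library. You would pass the resulting string to whatever completion model you use and read the next token(s) as the label.

```python
def build_few_shot_prompt(examples, query):
    """Format labeled examples, then the unlabeled query for the model to complete.

    `examples` is a list of (text, label) pairs taken from your own dataset.
    The "Text:/Label:" layout is an illustrative choice, not a required format.
    """
    blocks = [f"Text: {text}\nLabel: {label}" for text, label in examples]
    # The last block ends at "Label:" so the model's completion is the prediction.
    blocks.append(f"Text: {query}\nLabel:")
    return "\n\n".join(blocks)


# Hypothetical examples, only to show the shape of the prompt.
examples = [
    ("The delivery was late and the box was damaged.", "negative"),
    ("Great battery life, would buy again.", "positive"),
]
prompt = build_few_shot_prompt(examples, "Stopped working after two days.")
print(prompt)
```

Keeping the format identical across examples matters more than the exact wording, since the model is just continuing the pattern.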
+1, when in doubt, LLM it out.
You could also ask for explanations so when it gets it wrong, you can work on modifying your prompts/examples to get better performance.
Potentially you wouldn’t want to do this if:
My classification task is to classify a given essay as AI-generated or human-generated. I need the answer to be a score between 0 and 1 (both included), with 1 meaning AI-generated and 0 meaning human-generated.
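One possible way to get a score in that 0 to 1 range, assuming your model exposes a log-probability (or logit) for each of the two labels: take a softmax over the two values and use the AI-generated side as the score. This is only a sketch; the function name `ai_score` and the input numbers are hypothetical, and how you obtain the two values depends on your model and library.

```python
import math


def ai_score(logprob_ai, logprob_human):
    """Two-way softmax: returns P(AI-generated) as a float in [0, 1]."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logprob_ai, logprob_human)
    e_ai = math.exp(logprob_ai - m)
    e_human = math.exp(logprob_human - m)
    return e_ai / (e_ai + e_human)


# Made-up values, just to show the call.
score = ai_score(-0.2, -1.7)
print(round(score, 3))
```

The same function works for the two output logits of a fine-tuned classifier such as DeBERTa, so it does not commit you to either approach.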
Few-shot examples are a good idea for most classification tasks, but I don’t think generative LLMs can pick up the more intricate semantic patterns needed to tell AI-generated text from human-generated text with just a few examples. I’ll try it once and let you know, though!
Btw do you think fine-tuning would be better?