Prompt like:
Extract the company names from the texts below and return as an array
– ["Google", "Meta", "Microsoft"]
Yeah man, just use LangChain + a Pydantic class, or the Guidance lib by MS, with Mistral Instruct or Zephyr and you're golden.
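A minimal sketch of that LangChain + Pydantic route (import paths vary across LangChain versions, and the LLM hookup is left commented since no specific backend was agreed on in the thread; the sample text is made up):

```python
from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser

class Companies(BaseModel):
    names: list[str] = Field(description="Company names mentioned in the text")

parser = PydanticOutputParser(pydantic_object=Companies)

# The parser renders JSON-schema format instructions you prepend to the prompt.
prompt = (
    "Extract the company names from the text below.\n"
    f"{parser.get_format_instructions()}\n\n"
    "Text: Google and Meta both reported earnings; Microsoft did not."
)

# raw = llm.invoke(prompt)       # any chat model, e.g. Mistral Instruct or Zephyr
# companies = parser.parse(raw)  # -> Companies(names=["Google", "Meta", "Microsoft"])
```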
Er, JavaScript?
not to be an ass but what’s wrong with extracting the keywords and then going .split() ?
By keywords you mean entities? You don't need anything as heavy as a 7B LLM for that. I use https://www.textrazor.com/plans, free up to 500 requests per day.
On top of what others said, make sure to include a few few-shot examples in your prompt, and consider using constrained decoding (ensuring you get valid JSON for whatever schema you provide; see the pointers on how to do it with llama.cpp). A sketch of the grammar approach is below.
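Here's a hedged sketch using llama-cpp-python's GBNF grammar support. The model filename is an assumption, and the grammar is a deliberately simplified JSON-string-array grammar rather than llama.cpp's full json.gbnf:

```python
from llama_cpp import Llama, LlamaGrammar

# Grammar that forces the output to be a JSON array of strings.
GRAMMAR = r'''
root   ::= "[" ws (string (ws "," ws string)*)? ws "]"
string ::= "\"" [^"]* "\""
ws     ::= [ \t\n]*
'''

llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf")  # assumed local model file
grammar = LlamaGrammar.from_string(GRAMMAR)

out = llm(
    "Extract the company names from: Google and Meta both reported earnings.\n",
    grammar=grammar,
    max_tokens=128,
)
print(out["choices"][0]["text"])  # guaranteed to parse as a JSON array of strings
```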
For few-shotting chat models, append fake previous turns, like:
System → User → Assistant → ... → User → Assistant → User
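In an OpenAI-style messages list that looks something like this (the example contents are made up for illustration):

```python
messages = [
    {"role": "system", "content": "Extract company names as a JSON array."},
    # fake example turn 1
    {"role": "user", "content": "Apple sued Samsung over patents."},
    {"role": "assistant", "content": '["Apple", "Samsung"]'},
    # fake example turn 2: shows the model what to do when nothing matches
    {"role": "user", "content": "The weather was nice today."},
    {"role": "assistant", "content": "[]"},
    # the real query goes last
    {"role": "user", "content": "Google and Meta both reported earnings."},
]
```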
You can do this with a NER model like BERT. It's much faster, but it only does entity recognition.
Yeah, Named Entity Recognition with BERT works very well, provided you have a good dataset. Another limitation is that it can only handle 512 tokens at a time.
That task is called Named Entity Recognition, and you can do it without training data using our library (it works with any LLM that exposes an OpenAI-compatible API endpoint): https://github.com/plncmm/llmner
NLTK / Beautiful Soup should have some tools for this sort of thing. I guess it's NER.
For the record, I wouldn't advise using an LLM for this task, unless you can afford to waste VRAM.
IMHO, Zephyr beats Mistral here.
Seems like a job well suited for spaCy?
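For what it's worth, a minimal spaCy sketch (assumes `en_core_web_sm` is installed via `python -m spacy download en_core_web_sm`; the sample text is made up):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Google and Meta both reported earnings; Microsoft did not.")

# Keep only organization entities.
companies = [ent.text for ent in doc.ents if ent.label_ == "ORG"]
print(companies)  # e.g. ['Google', 'Meta', 'Microsoft']
```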
I use mistral-7b-openorca.q8_0, and this is my prompt:

system: You are a helpful machine. Always answer with the THREE most important keywords from the information provided to you between BEGININPUT and ENDINPUT. Here is an example:
user: BEGININPUT A tree is planted for each contract. Your contribution will be invested 100% sustainably! ENDINPUT
assistant: [contract, tree, sustainable]
user:
Hugging Face Transformers has such models available.
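Something like this (dslim/bert-base-NER is one commonly used checkpoint, not the only option; the sample text is made up):

```python
from transformers import pipeline

# aggregation_strategy="simple" merges subword tokens into whole entities.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

ents = ner("Google and Meta both reported earnings; Microsoft did not.")
companies = [e["word"] for e in ents if e["entity_group"] == "ORG"]
print(companies)  # e.g. ['Google', 'Meta', 'Microsoft']
```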
Why do you need an LLM for this? Just use any NER model. It will be blazing fast and run locally.
Because let's say you train your BERT model to do this: you'll have a specific, limited set of classes trained on a specific type of document.
It will work on Wikipedia articles but not on transcripts from your local police station.
Using an LLM lets the task inherit the LLM's broad world knowledge.