Honestly the M1 is probably the cheapest solution you have. Get yourself LM Studio and try out a 7B K_M-quant model; you're going to struggle with anything larger than that. But that will let you experience what we are all playing with.
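If you want to hit it from code as well, LM Studio can serve the loaded model over an OpenAI-compatible local API. A minimal sketch, assuming the default localhost:1234 port and the openai Python client (the model name is a placeholder; LM Studio routes to whatever you've loaded):

```python
# Minimal sketch: chatting with whatever model LM Studio has loaded, via its
# OpenAI-compatible local server (default port 1234; the api_key is ignored).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

resp = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio serves the loaded model
    messages=[{"role": "user", "content": "Give me a one-line hello."}],
    temperature=0.7,
)
print(resp.choices[0].message.content)
```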
I have only recently found the correct answer, which is: take the information and use Sparse Priming Representations (SPR) to distil it. Next, feed this text to privateGPT to use as a vector DB document. Since SPR condenses the text, you will be able to use more items as part of the retrieval phase.
Now query the LLM using the vector DB; thanks to the SPR-encoded text you get highly detailed and accurate results with a small LLM that is easy to run.
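To make the SPR step concrete, here's a hedged sketch of the condensation pass, assuming a local OpenAI-compatible endpoint (e.g. LM Studio's server) and a paraphrase of the usual SPR system prompt; the file name and model name are just examples:

```python
# Sketch of the SPR condensation pass: ask the local model to compress a
# document into a Sparse Priming Representation before it goes anywhere
# near the vector DB. Endpoint, model name and file name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

SPR_SYSTEM = (
    "You are a Sparse Priming Representation (SPR) writer. Render the input "
    "as a short list of succinct assertions, associations and concepts, "
    "using as few words as possible while preserving the meaning."
)

def to_spr(text: str) -> str:
    resp = client.chat.completions.create(
        model="local-model",  # placeholder for whatever 7B you have loaded
        messages=[
            {"role": "system", "content": SPR_SYSTEM},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

# Each SPR is far shorter than its source, so more chunks fit into the
# retrieval context at query time.
spr_chunk = to_spr(open("chapter1.txt").read())
```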
They're generally good for single-shot or few-shot tasks, e.g. getting Cliff Notes or creating templates. You can use a vector DB for informational accuracy. They struggle to keep character and context, I've noticed.
Just tried it, can confirm this guy knows what he is talking about ^. Pretty great model tbh.
Explain your train of thought about OpenHermes. What examples do you have?
The biggest one you can run at a usable rate. The larger models tend to have more nuance; granted, some new models are challenging this notion, but that's the general way to go about it.
It’s very good
Neural Chat 7B is pretty good, though this seems a bit stupid imho; you're better off using the model stated before and using something like privateGPT to ingest the book into a vector DB. Then you can effectively "talk to your books".
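If you want to see the shape of what privateGPT does under the hood, here's a generic sketch of the ingest-then-retrieve loop using chromadb; chromadb is my assumption for illustration, privateGPT wires up its own embedding/DB stack internally:

```python
# Generic sketch of the ingest-then-retrieve loop that privateGPT automates.
import chromadb

db = chromadb.Client()  # in-memory vector DB
books = db.create_collection("books")

# Ingest: store the SPR-condensed chunks (chromadb embeds them by default).
chunks = ["SPR chunk one ...", "SPR chunk two ..."]
books.add(documents=chunks, ids=[f"c{i}" for i in range(len(chunks))])

# Retrieve: grab the closest chunks and prepend them to the question before
# sending the whole thing to the 7B model.
question = "What does chapter 1 say about the main argument?"
hits = books.query(query_texts=[question], n_results=3)
context = "\n".join(hits["documents"][0])
prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
```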
Not sure you're going to get something that small yet; Neural Chat 7B is the closest logically sound model, and it's a lot better when given a vector DB.
Not sure about the K, but the M means medium loss of info during the quantisation phase, AFAIK.
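To make the "loss of info" concrete, here's a toy sketch of block-wise 4-bit quantisation, in the spirit of (but not byte-for-byte identical to) llama.cpp's Q4 formats; the round-trip error it prints is the information the quant gives up:

```python
import numpy as np

# Toy block-wise 4-bit quantisation: one scale/offset per 32-weight block.
def quantize_block(w: np.ndarray, bits: int = 4):
    levels = 2**bits - 1
    scale = max((w.max() - w.min()) / levels, 1e-12)  # avoid div-by-zero
    zero = w.min()
    q = np.round((w - zero) / scale).astype(np.uint8)  # 0..15 per weight
    return q, scale, zero

def dequantize_block(q, scale, zero):
    return q.astype(np.float32) * scale + zero

block = np.random.randn(32).astype(np.float32)  # one block of weights
q, scale, zero = quantize_block(block)
restored = dequantize_block(q, scale, zero)
print("mean abs error:", np.abs(block - restored).mean())  # the lost info
```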