Models Megathread #2 - What models are you currently using?

Technical_Leather949@alien.top · 3 years ago

ttkciar@alien.top · 3 years ago

Mostly I’m still using slightly older models, with a few slightly newer ones now:

marx-3b-v3.Q4_K_M.gguf for “fast” RAG inference,
medalpaca-13B.ggmlv3.q4_1.bin for medical research,
mistral-7b-openorca.Q4_K_M.gguf for creative writing,
NousResearch-Nous-Capybara-3B-V1.9-Q4_K_M.gguf for creative writing, and probably for giving my IRC bots conversational capabilities (a work in progress),
puddlejumper-13b-v2.Q4_K_M.gguf for physics research, questions about society and philosophy, “slow” RAG inference, and translating between English and German,
refact-1_6b-Q4_K_M.gguf as a coding copilot, for fill-in-the-middle,
rift-coder-v0-7b-gguf.git as a coding copilot when I’m writing Python or trying to figure out my coworkers’ Python,
scarlett-33b.ggmlv3.q4_1.bin for creative writing, though less than I used to.
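
For anyone scripting against a lineup like this, the list above could be captured as a small lookup table. This is only a rough sketch: the task labels and the names `MODELS_FOR_TASK`, `DEFAULT_MODEL`, and `pick_model` are hypothetical, and falling back to PuddleJumper for unknown tasks is an assumption, not something the post prescribes. The file names are copied from the list as written.

```python
# Hypothetical task -> model-file table built from the list above.
# Task labels are illustrative; file names are copied verbatim from the post.
MODELS_FOR_TASK: dict[str, list[str]] = {
    "rag_fast": ["marx-3b-v3.Q4_K_M.gguf"],
    "rag_slow": ["puddlejumper-13b-v2.Q4_K_M.gguf"],
    "medical_research": ["medalpaca-13B.ggmlv3.q4_1.bin"],
    "physics_research": ["puddlejumper-13b-v2.Q4_K_M.gguf"],
    "translate_en_de": ["puddlejumper-13b-v2.Q4_K_M.gguf"],
    "creative_writing": [
        "mistral-7b-openorca.Q4_K_M.gguf",
        "NousResearch-Nous-Capybara-3B-V1.9-Q4_K_M.gguf",
        "scarlett-33b.ggmlv3.q4_1.bin",
    ],
    "fill_in_the_middle": ["refact-1_6b-Q4_K_M.gguf"],
    "python_copilot": ["rift-coder-v0-7b-gguf.git"],
}

# Assumed fallback for tasks that are not in the table.
DEFAULT_MODEL = "puddlejumper-13b-v2.Q4_K_M.gguf"


def pick_model(task: str) -> str:
    """Return the first model listed for a task, or the assumed default."""
    candidates = MODELS_FOR_TASK.get(task)
    return candidates[0] if candidates else DEFAULT_MODEL
```

Keeping a list per task leaves room for several candidates (as with creative writing) while `pick_model` stays a one-liner for the common case.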

I also have several models which I’ve downloaded but not yet had time to evaluate, and I’m downloading more as we speak (though even more slowly than usual; a couple of weeks ago my download rates from HF dropped to roughly a third of what they were, and I don’t know why).

Some which seem particularly promising:

yi-34b-200k-llamafied.Q4_K_M.gguf
rocket-3b.Q4_K_M.gguf
llmware’s “bling” and “dragon” models. I’m downloading them all, though so far GGUFs are available for only three of them. I’m particularly intrigued by llmware-dragon-falcon-7b-v0-gguf, which is tuned specifically for RAG and is supposedly “hallucination-proofed”, and by llmware-bling-stable-lm-3b-4e1t-v0-gguf, which might be a better IRC-bot conversational model.

Of all of these, the one I use most frequently is PuddleJumper-13B-v2.