I like 7B models, but aren't 13B models like Orca 2 better? What's the best?
The best… my favorite at the moment is Mythalion-13B. But this may already change tomorrow.
Best is subjective, however some very popular models right now are:
- Xwin Mlewd - The dark horse
- Tiefighter - new hotness
- Mythomax - old reliable
If you're not set on 13B, a fine-tune of Qwen-14B should be a good fit, but there are almost none. There is also CausalLM-14B.
Which benchmarks are the easiest/most reliable to implement locally? I haven't had a chance to test any.
Related question: what is the best open-source LLM in general? I'm going to guess LLAMA 2 70B?
I feel like this and similar questions like this should be revived monthly.
Yeah, I kind of thought this was an automated post, not gonna lie.
Well, it gets posted a few times a week, so it kinda is…
lol doesn’t surprise me. I’m not on here that much
We need a monthly summary, at least, but even that feels like far too long given how fast things are evolving lately. One moment, we all seem to agree MythoMax is the bee's knees, then suddenly we've got Mythalion and a bunch of ReMM variants. Suddenly, we're getting used to Mistral 7Bs giving those 13B models a run for their money, and then Yi-34B 200K and Yi-34B Chat appear out of nowhere. Decent, out-of-the-box RP mixes and fine-tunes of those surely won't be far behind…
It feels like this has all happened in the past couple of weeks.
Don’t get me wrong, I love it, but I’m dizzy! Excited, but dizzy.
Makes sense. The models do get updated from time to time, so a monthly check on whether they’ve improved or worsened helps. Just look at ChatGPT.
Genuinely this, and not in a shitty “ugh, we get asked this so much” way, but in a “keeping a frequently refreshed thread on the currently recommended models at different sizes is a good idea” way, because there’s just so much out there and it can be very hard to follow.
Maybe just post it as a “What’s your current 7B/13B/30B/70B model?” sticky-type thing.
A moderator replied when I shared the idea three months ago, but as far as I know we’ve only gotten one thread out of it. I still feel like a weekly megathread (since new models come out for all sorts of reasons) would be the way to go.
The plan was to keep it up as a recurring topic, but the engagement from that post, based on daily unique views and comments, was unfortunately far below what was expected. I see the feedback though, and I agree that it can be tried again.
To keep it fresh, I was thinking the sub could actively rotate two separate megathreads. One asking what model everyone is currently using for different categories and another for something else. I think it’d be great if there were themed weekly discussion posts, so if anyone has any ideas, feel free to send a modmail anytime or let me know here.
A post for this will go up soon, either this weekend or next week: “What’s your current 7B/13B/33B/70B model?” That title should be easily searchable, and if the community likes this idea, the weekly megathreads could be accumulated and linked in the sidebar.
Things move so fast I don’t mind the threads about it.
Or a leaderboard-style ranking of models so we can track them.
You mean like this one? https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
Hey, that’s cool. Wasn’t aware of this one. It should be pinned at the top and referenced in the sidebar.
The only issue I can see now is tracking the community models. If a newer model only shows up in the comments, I don’t see all of them there.
Maybe some sort of hybrid?
That sounds cool. Vote on the model you are using the most that week.
I just wish more authors on HF would write a paragraph explaining what purpose their model is intended for rather than just listing source models names that are also lacking explanation.
I think just a weekly pinned thread “what model are you using?” is good
Mistral-7b
Xwin mlewd is really decent imo
Lzlv
Since the Mistral release there are (almost) no 13B models better than Mistral fine-tunes, and this can be seen on the Open LLM Leaderboard: first is Qwen-14B, second is a Mistral fine-tune (Intel/neural-chat), and Orca-13B comes sixth.
Every 7b model I tried has been worse than my fav 13b models
I’ve stuck with mistral-open-orca for my use cases. I played around with some others and they either didn’t do any better than mistral-open-orca or just flat out sucked.
Edit: The OpenHermes fine-tune was one of the ones that just wasn’t any better than OpenOrca, and it came down to my use cases, personal preference, and response styles. So I could see it being a close alternative for some people.
By the way, someone decided to create an initiative where everyone can rate models and AI sites. It’s quite new and features are still being added, but I think the site is worth mentioning because it has potential.
It says ERP, but even right now you can vote on whether a model is good at story writing or roleplay in general. You can also leave a small review there.
Raising a housekeeping issue:
Can we replace this question with like a pinned monthly/biweekly “survey and discussion” post (for all sizes) rather than seeing it here every other day and answering it halfheartedly until we all get sick and tired? Of course everyone wants the most efficient and cost-effective SOTA, but let’s maybe find a better way to go about it?
For me, it’s close between orca2, Openhermes-2.5-mistral-7b and LLaMA2-13B-Psyfighter2
addendum: which ones are good for writing NSFW?
Noromaid is good if you use the right settings.
What are the settings, if you don’t mind sharing them?
Slightly off-topic: I’ve been testing 13B and 7B models for a while now, and I’m really interested in whether people have a good one to check out, because for now I’ve settled on a 7B model that seems to work better than most of the 13B models I’ve tried.
Specifically, I’ve been using OpenChat 3.5 7B (Q8 and Q4) and it’s been really good for my work so far, punching well above its weight class. Much better than any of the 13B models I’ve tried. (I’m not running any specific tests; it just seems to understand what I want better than the others. I’m not doing any function calling, but even the 4-bit 7B model is able to generate JSON as well as respond coherently.)
Note: I’m specifically using the original (non-16k) models; the 16k models seem to be borked or something?
I agree, it’s my favourite 7B model too. I use it mainly to help me with bot personalities. It’s too bad it’s not really fine-tuned for roleplay, otherwise it would wreck. And yes, 16k is broken for me too.
In general I think it would be nice if people tried mixing several Mistral models more often, as with Mistral-11B-CC-Air-RP. Yes, it has serious problems understanding the context and the characters go into psychosis, but if you use a smaller quantization (like Q5 or Q6) and the min-P sampling parameter, it improves the situation a bit. Apparently something just went wrong during the model merge. Otherwise, this model is really the most unique I’ve tried; characters talk similarly to early Character AI.
https://huggingface.co/TheBloke/Mistral-11B-CC-Air-RP-GGUF/tree/main?not-for-all-audiences=true
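For anyone unfamiliar with the min-P sampling parameter mentioned above: it filters the next-token distribution by keeping only tokens whose probability is at least `min_p` times the probability of the single most likely token, then renormalizes what remains. A minimal sketch in plain Python, using made-up probabilities rather than output from any real model:

```python
def min_p_filter(probs, min_p=0.1):
    """Keep tokens with probability >= min_p * p_max, then renormalize."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Illustrative distribution (not from a real model):
probs = {"the": 0.5, "a": 0.3, "qux": 0.18, "xyzzy": 0.02}
filtered = min_p_filter(probs, min_p=0.1)
# "xyzzy" falls below the threshold (0.02 < 0.1 * 0.5) and is dropped;
# the remaining tokens are renormalized to sum to 1.
```

Compared to a fixed top-P cutoff, the threshold scales with the model's confidence, which is why a higher min-P can help rein in an unstable merge without flattening confident predictions.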