I like 7B models, but aren't 13B models like Orca 2 better? What's the best?
The best… my favorite at the moment is Mythalion-13B. But this may already change tomorrow.
Best is subjective, however some very popular models right now are:
- Xwin Mlewd - The dark horse
- Tiefighter - new hotness
- Mythomax - old reliable
If you're not set on 13B, a fine-tune of Qwen-14B should be a good fit, but there are almost none. There is also CausalLM-14B.
Which benchmarks are the easiest/most reliable to implement locally? I haven't had a chance to test any.
Related question: what is the best open-source LLM in general? I'm going to guess LLAMA 2 70B?
I feel like this and similar questions like this should be revived monthly.
Yeah, I kind of thought this was an automated post, not gonna lie.
Well, it gets posted a few times a week, so it kinda is…
lol doesn’t surprise me. I’m not on here that much
We need a monthly summary, at least, but even that feels like far too long given how fast things are evolving lately. One moment, we all seem to agree MythoMax is the bee's knees, then suddenly we've got Mythalion and a bunch of ReMM variants. Suddenly, we're getting used to Mistral 7Bs giving those 13B models a run for their money, and then Yi-34B 200K and Yi-34B Chat appear out of nowhere. Decent, out-of-the-box RP mixes and fine-tunes of those surely won't be far behind…
It feels like this has all happened in the past couple of weeks.
Don’t get me wrong, I love it, but I’m dizzy! Excited, but dizzy.
Makes sense. The models do get updated from time to time, so a monthly check on whether they’ve improved or worsened helps. Just look at ChatGPT.
Genuinely this, and not in a shitty “ugh, we get asked this so much” way, but in a “keeping a frequently refreshed thread on the currently recommended models at different sizes is a good idea” way, because there’s just so much out there and it can be very hard to follow.
Maybe just post it as a “What’s your current 7B/13B/30B/70B model?” sticky-type thing.
A moderator replied when I shared the idea three months ago, but as far as I know we’ve only gotten one thread out of it. I still feel like a weekly megathread (since new models come out for all sorts of reasons) would be the way to go.
The plan was to keep it up as a recurring topic, but the engagement from that post, based on daily unique views and comments, was unfortunately far below what was expected. I see the feedback though, and I agree that it can be tried again.
To keep it fresh, I was thinking the sub could actively rotate two separate megathreads. One asking what model everyone is currently using for different categories and another for something else. I think it’d be great if there were themed weekly discussion posts, so if anyone has any ideas, feel free to send a modmail anytime or let me know here.
A post for this will go up soon, either this weekend or next week: “What’s your current 7B/13B/33B/70B model?” That title should be easily searchable, and if the community likes this idea, the weekly megathreads could be accumulated and linked in the sidebar.
Things move so fast I don’t mind the threads about it.
Or a leaderboard-style ranking of models so we can track them.
You mean like this one? https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
Hey, that’s cool. Wasn’t aware of this one. It should be pinned at the top and referenced in the sidebar.
The only issue I can see now is tracking the community models. If a newer model only shows up in the comments, I don’t see all of them there.
Maybe some sort of hybrid?
That sounds cool. Vote on the model you are using the most that week.
I just wish more authors on HF would write a paragraph explaining what purpose their model is intended for rather than just listing source models names that are also lacking explanation.
I think just a weekly pinned thread “what model are you using?” is good
Mistral-7b
Xwin mlewd is really decent imo
Lzlv
Since the Mistral release there are (almost) no 13B models better than Mistral fine-tunes, and this can be seen on the Open LLM Leaderboard: first is Qwen-14B, second is a Mistral fine-tune (Intel/neural-chat), and Orca-13B comes sixth.
Every 7b model I tried has been worse than my fav 13b models
I’ve stuck with mistral-open-orca for my use cases. I played around with some others and they either didn’t do any better than mistral-open-orca or just flat out sucked.
Edit: The OpenHermes fine-tune was one of the ones that just wasn’t any better than OpenOrca, and it came down to my use cases, personal preference, and response styles. So I could see it being a close alternative for some people.
By the way, someone decided to create an initiative where everyone can rate models and AI sites. It’s quite new and features are still being added, but I think the site is worth mentioning because it has potential.
It says ERP, but even right now you can vote on whether a model is good at story writing or roleplay in general. You can also leave a small review there.
Raising a housekeeping issue:
Can we replace this question with like a pinned monthly/biweekly “survey and discussion” post (for all sizes) rather than seeing it here every other day and answering it halfheartedly until we all get sick and tired? Of course everyone wants the most efficient and cost-effective SOTA, but let’s maybe find a better way to go about it?
For me, it’s close between orca2, Openhermes-2.5-mistral-7b and LLaMA2-13B-Psyfighter2
addendum: which ones are good for writing NSFW?
Noromaid is good if you use the right settings.
What are the settings, if you don’t mind sharing them?
Slightly off-topic: I’ve been testing 13B and 7B models for a while now, and I’m really interested in whether people have a good one to check out, because for now I’ve settled on a 7B model that seems to work better than most of the 13B models I’ve tried.
Specifically, I’ve been using OpenChat 3.5 7B (Q8 and Q4) and it’s been really good for my work so far, punching well above its weight class. Much better than any of the 13B models I’ve tried. (I’m not running any specific tests; it just seems to understand what I want better than the others. I’m not doing any function calling, but even the 4-bit 7B model is able to generate JSON as well as respond coherently.)
Note: I’m specifically using the original (non-16k) models; the 16k models seem to be borked or something?
I agree, it’s my favourite 7B model too. I use it mainly to help me with bot personalities. It’s too bad it’s not really fine-tuned for roleplay, otherwise it would wreck. And yes, 16k is broken for me too.
In general I think it would be nice if people tried mixing several Mistral models more often, as with Mistral-11B-CC-Air-RP. Yes, it has serious problems understanding the context and the characters go into psychosis, but if you use a smaller quantization (like Q5 or Q6) and the min-P sampling parameter, it improves the situation a bit. Apparently something just went wrong during the model merge. Otherwise, this model is really the most unique I’ve tried; characters talk similarly to early Character AI.
https://huggingface.co/TheBloke/Mistral-11B-CC-Air-RP-GGUF/tree/main?not-for-all-audiences=true
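For anyone unfamiliar with the min-P sampling parameter mentioned above: it filters the next-token distribution by keeping only tokens whose probability is at least `min_p` times the probability of the single most likely token, then renormalizes what remains. A minimal sketch in plain Python, using made-up probabilities rather than output from any real model:

```python
def min_p_filter(probs, min_p=0.1):
    """Keep tokens with probability >= min_p * p_max, then renormalize."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Illustrative distribution (not from a real model):
probs = {"the": 0.5, "a": 0.3, "qux": 0.18, "xyzzy": 0.02}
filtered = min_p_filter(probs, min_p=0.1)
# "xyzzy" falls below the threshold (0.02 < 0.1 * 0.5) and is dropped;
# the remaining tokens are renormalized to sum to 1.
```

Compared to a fixed top-P cutoff, the threshold scales with the model's confidence, which is why a higher min-P can help rein in an unstable merge without flattening confident predictions.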