Eric Hartford, the author of dolphin models, released dolphin-2.2-yi-34b.
This is one of the earliest community finetunes of the yi-34B.
yi-34B was developed by a Chinese company and they claim sota performance that are on par with gpt-3.5
HF: https://huggingface.co/ehartford/dolphin-2_2-yi-34b
Announcement: https://x.com/erhartford/status/1723940171991663088?s=20
I took a short break from my 70B tests (still working on that!) and tried TheBloke/dolphin-2_2-yi-34b-GGUF Q4_0. It instantly claimed 4th place on my list.
A 34B taking 4th place among the 13 best 70Bs! A 34B model that beats 9 70Bs (including dolphin-2.2-70B, Samantha-1.11-70B, StellarBright, Airoboros-L2-70B-3.1.2 and many others). A 34B with 16K native context!
Yeah, I’m just a little excited. I see a lot of potential with the Yi series of models and proper finetunes like Eric’s.
Haven’t done the RP tests yet, so back to testing. Will report back once I’m done with the current batch (70Bs take so damn long, and 120B even more so).
How good are the Yi models with coding?
My first test with Yi delivered a non perfect but working Tetris clone in ~3 prompts. I was very impressed, cant wait to try the Dolphin variant.
Wow i gotta try it thanks for the hype! Does the GPTQ/AWQ versions differ from GGUF in terms of context? It listed that the context is only 4096
Agreed - This is the best conversational model I have tried yet.
34B is the largest model size that I prefer running on my GPU, and this along with Nous-Capybara are fantastic.
What kind of prompt formats are you using for it? I’m downloading it now
Which is the best 70B on your list?
I’m still working on the updated 70B comparisons/tests, but right now, the top three models are still the same as in the first part of my Huge LLM Comparison/Test: 39 models tested (7B-70B + ChatGPT/GPT-4): lzlv_70B, SynthIA-70B-v1.5, chronos007-70B. Followed by dolphin-2_2-yi-34b.
SynthIA-70B-v1.5 seems to have the same context length of 2k as SynthIA-70B-v1.2, not the same 4k context length as SynthIA-70B-v1.2b
You’re right with your observation, when I load the GGUF, KoboldCpp says “n_ctx_train: 2048”. Could that be an erroneous display? Because I’ve always used v1.5 with 4K context, did all my tests with that, and it’s done so well. If it’s true, it might even be better with native context! Still, 2K just doesn’t cut it anymore, though.