Messing around with Yi-34B-based models (Nous-Capybara, Dolphin 2.2) lately, I’ve been experiencing repetition in model output, where sections of previous outputs are included in later generations.
This appears to persist with both GGUF and EXL2 quants, and happens regardless of sampling parameters or Mirostat Tau settings.
I was wondering if anyone else has experienced similar issues with the latest finetunes, and if they were able to resolve the issue. The models appear to be very promising from Wolfram’s evaluation, so I’m wondering what error I could be making.
Currently using Text Generation Web UI with SillyTavern as a front-end, with Mirostat at Tau values between 2 and 5, or the Midnight Enigma preset with Rep. Penalty at 1.0.
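For reference, a Rep. Penalty of 1.0 is effectively "off" in the common CTRL-style formulation, which may be relevant to the repetition described above. A minimal sketch, assuming that formulation (the function name is hypothetical and this is not the actual Text Generation Web UI code):

```python
def apply_repetition_penalty(logits, previous_token_ids, penalty):
    """CTRL-style penalty (assumed formulation): for every token already
    seen, divide a positive logit by `penalty` and multiply a negative
    logit by `penalty`, so the token becomes less likely either way."""
    adjusted = list(logits)
    for token_id in set(previous_token_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


# Toy logits for a 4-token vocabulary; tokens 0, 1 and 3 were already generated.
logits = [2.0, -1.0, 0.5, 3.0]
seen = [0, 1, 3]

unchanged = apply_repetition_penalty(logits, seen, 1.0)  # penalty 1.0: no-op
penalized = apply_repetition_penalty(logits, seen, 1.2)  # seen tokens pushed down

print(unchanged)
print(penalized)
```

With the penalty at 1.0 the logits come back untouched, so only the other samplers (or Mirostat) are doing anything about repeated text.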
I had high hopes for Yi-34B chat, but when I tried it I found it is not very good.
70B models are better (well of course), but I think even some 20B models are better.
I am having better luck with 2.4BPW EXL2 quants of 70B models from Lone_Striker lately - Euryale 1.3, LZLV, etc.
Even at the smaller quants, they are quite strong with the right settings. Easily comparable to a 34B at Q4_K_M, in my experience.
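Rough arithmetic on why those two end up in the same memory class: weights-only size is roughly parameters times bits per weight, divided by 8. A sketch, assuming about 4.85 bits per weight for Q4_K_M (an approximation, not a figure from this thread) and ignoring KV cache and runtime overhead:

```python
def weight_size_gb(params_billions, bits_per_weight):
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billions * bits_per_weight / 8


# 70B at 2.4 bits per weight (EXL2), as mentioned above.
size_70b_exl2 = weight_size_gb(70, 2.4)

# 34B at Q4_K_M; 4.85 bpw is an assumed approximate value.
size_34b_q4km = weight_size_gb(34, 4.85)

print(f"70B @ 2.4 bpw:  ~{size_70b_exl2:.1f} GB")
print(f"34B @ 4.85 bpw: ~{size_34b_q4km:.1f} GB")
```

Both come out at around 21 GB under these assumptions, so the comparison is between models that need about the same amount of memory.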