I made one too, but the 34B Yi output is probably better. This model at 2.9bpw is worse than regular Tess-M at 4.6bpw, and all of the usual Yi issues, like repetition, are more pronounced. I uploaded it, but personally I find it lacking. Also, uploading 50B+ models to HF is seriously a pain in the ass.
https://huggingface.co/lodrick-the-lafted/Kaiju-A-57B
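For anyone wondering why those two quants get compared at all: at those bit rates the weights come out to roughly the same size. A rough back-of-the-envelope sketch, assuming nominal parameter counts of 57B and 34B (the real counts differ slightly) and ignoring KV cache, context, and format overhead:

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes).

    Assumes every parameter is stored at the average bits-per-weight;
    KV cache and runtime overhead are not included.
    """
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Nominal sizes, used here only for illustration.
merge_57b = weight_size_gb(57, 2.9)   # the 57B merge at 2.9bpw
tess_34b = weight_size_gb(34, 4.6)    # a 34B Yi finetune at 4.6bpw

print(f"57B @ 2.9bpw: {merge_57b:.2f} GB")
print(f"34B @ 4.6bpw: {tess_34b:.2f} GB")
```

So the two land within about a gigabyte of each other, which is why it comes down to whether the extra parameters make up for the much lower bit rate.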
How does merging work? How do you choose which layers to take from which models?
How do you make the Yi models work for you? I find them super subpar so far.
I use dolphin-yi because it listens the best of the Yi finetunes, but I find myself screwing around with the settings for Yi more than with most models. If it starts looping, I pick a different preset and tweak it.
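The comment doesn't say which settings get tweaked, so this is only an assumption, but repetition penalty is one of the usual knobs for looping. A minimal sketch of the common divide-or-multiply form of that penalty, written in plain Python with a hypothetical function name and default value:

```python
def apply_repetition_penalty(logits, previous_token_ids, penalty=1.15):
    """Return a copy of `logits` with already-generated tokens made less likely.

    Positive logits are divided by `penalty` and negative ones multiplied,
    so a penalty > 1 always lowers the score of a repeated token.
    The function name and the 1.15 default are illustrative only.
    """
    adjusted = list(logits)
    for token_id in set(previous_token_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


# Toy vocabulary of three tokens; tokens 0 and 1 were already generated.
example = apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1, 0], penalty=2.0)
print(example)
```

Raising the penalty too far tends to hurt coherence, so small steps are probably the safer way to experiment.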