1.3B with 68.29% HumanEval lol, don't behead me. Part of my project PIC (partner-in-crime)

ahm_rimer@alien.top · 3 years ago

1.3B with 68.29% HumanEval lol, don't behead me. Part of my project PIC (partner-in-crime)

naptastic@alien.top · 3 years ago

Ok, it finally downloaded and I’ve spent a few minutes with it. It keeps getting into endless pathways of jargon (e.g., “fair play make world communal environment tolerant embraces diversity embrace equity promote unity instill resilience proactive leadership” and it just goes on like that–no punctuation, no connecting words–until it reaches the token limit). What loader and settings work best with this model?

ahm_rimer@alien.top · 3 years ago

Try the chat inference code mentioned in the model card if you’re running it on GPU. The model is small enough to test on free Colab as well.

naptastic@alien.top · 3 years ago

That definitely works better. I wouldn’t trust it too far though. It just told me I can remove the first part of a file with one seek() and one truncate() call…
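For anyone wondering why that answer is wrong: in Python, `truncate()` cuts a file off at a given position, so `seek(n)` followed by `truncate()` keeps the first `n` bytes and throws away the rest, which is the opposite of what was asked. A rough sketch of the difference is below. The helper names (`seek_then_truncate`, `drop_prefix`, `demo`) are made up for illustration, and the rewrite approach reads the whole remainder into memory, so treat it as a small-file sketch only.

```python
import os
import tempfile


def seek_then_truncate(path, n):
    """What the model suggested: one seek() and one truncate()."""
    with open(path, "r+b") as f:
        f.seek(n)
        f.truncate()  # cuts at the current position, so bytes [0, n) survive


def drop_prefix(path, n):
    """Actually remove the first n bytes by shifting the remainder forward."""
    with open(path, "r+b") as f:
        f.seek(n)
        rest = f.read()  # everything after the prefix (held in memory)
        f.seek(0)
        f.write(rest)    # overwrite from the start of the file
        f.truncate()     # drop the stale bytes left over at the end


def demo():
    """Run both approaches on the same content and return the two results."""
    results = []
    for func in (seek_then_truncate, drop_prefix):
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(b"HEADERpayload")
            path = tmp.name
        try:
            func(path, 6)
            with open(path, "rb") as f:
                results.append(f.read())
        finally:
            os.remove(path)
    return tuple(results)


if __name__ == "__main__":
    naive, fixed = demo()
    print("seek + truncate left:", naive)
    print("drop_prefix left:    ", fixed)
```

So removing the head of a file needs a read and a rewrite (or a copy to a new file) on top of the seek and truncate.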

AfterAte@alien.top · 3 years ago

Try using the Alpaca template, and turn the temperature down to 0.1 or 0.2 and the repetition penalty to 1. I haven’t tested this on this model yet, but those settings work for Deepseek-coder. If you’re using oobabooga, the StarChat preset works for me.
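If you are scripting it yourself instead of using a UI, the settings above could look roughly like this. The template string is the standard no-input Alpaca prompt. The dictionary keys follow the usual Hugging Face `generate()` keyword names; `max_new_tokens` and `do_sample` are my own assumptions and were not part of the suggestion, and no model is loaded here because the thread does not name the checkpoint.

```python
# Standard Alpaca prompt (the variant without an extra "Input" section).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the Alpaca prompt format."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


# Sampling settings from the suggestion above.
GENERATION_SETTINGS = {
    "do_sample": True,          # assumption: temperature only applies when sampling
    "temperature": 0.2,         # suggested range was 0.1 to 0.2
    "repetition_penalty": 1.0,  # 1.0 means no repetition penalty is applied
    "max_new_tokens": 256,      # assumption: not mentioned in the thread
}


if __name__ == "__main__":
    prompt = build_alpaca_prompt("Write a Python function that reverses a string.")
    print(prompt)
    print(GENERATION_SETTINGS)
```

With a loaded model and tokenizer, the prompt would be tokenized and the dictionary passed as keyword arguments to the generation call.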

naptastic@alien.top · 3 years ago

Just selecting StarChat, it instantly became conversational. :+1:

AfterAte@alien.top · 3 years ago

Great! Have fun!