I have tried to set up three different versions of it: TheBloke’s GPTQ and AWQ quants and the original deepseek-coder-6.7b-instruct.

I have tried the 33B as well.

My specs are 64GB RAM, an RTX 3090 Ti, and an i7-12700K.

With AWQ I just get a bugged response (“”“”“”“”“”“”“”") until max tokens.

GPTQ works much better, but all versions seem to add an unnecessary * at the end of some lines and give worse results than the website (deepseek.com). Let’s say I ask for a snake game in pygame: it usually gives an unusable version, and after 5-6 tries I get a somewhat working version, but I still need to ask for a lot of changes.

On the official website, the code works on the first try without any problems.

I am using the Alpaca template, adjusted to match the DeepSeek version (oobabooga webui).

What could be causing this? Is the website version different from the Hugging Face model?
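
Since the prompt template is one of the likelier culprits, here is a minimal sketch for checking it outside the UI. It builds a generic Alpaca-style prompt and finds the first character where it differs from whatever text the UI actually sends. The function names `build_prompt` and `first_difference` are made up for illustration, and the system line shown is the standard Alpaca wording, not necessarily what the DeepSeek model expects; the model card on Hugging Face is the authority for the exact system prompt and stop token.

```python
# Standard Alpaca system line. ASSUMPTION: the DeepSeek instruct model may
# expect a different system prompt; check the model card and pass it in.
ALPACA_SYSTEM = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_prompt(instruction: str, system: str = ALPACA_SYSTEM) -> str:
    """Return an Alpaca-style prompt that ends where the model should start answering."""
    return f"{system}\n\n### Instruction:\n{instruction}\n\n### Response:\n"


def first_difference(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if the strings are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string is a prefix of the other (or they are equal).
    return -1 if len(a) == len(b) else min(len(a), len(b))


if __name__ == "__main__":
    expected = build_prompt("Write a snake game in pygame.")
    # Paste the exact prompt text captured from the UI here (hypothetical value).
    captured = expected.replace("### Response:\n", "### Response: ")
    pos = first_difference(expected, captured)
    print("identical" if pos == -1 else f"prompts diverge at character {pos}")
```

Even a small mismatch (a missing newline, a different system line) can change output quality, so comparing character by character is more reliable than eyeballing the two prompts.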

  • FullOf_Bad_Ideas@alien.topB
    1 year ago

    I have it (33b) running pretty well: GPTQ in oobabooga, RTX 3090 Ti, 64GB of RAM, ExLlamaV2 HF loader, standard Alpaca template without a modified system prompt. I also get the same ‘’‘’‘’‘’’ with the AWQ version. Please share which GPTQ version you have (group size, act order). I will post the exact settings I use in an hour. I don’t know how the version I have locally compares to the hosted version, but it’s pretty good. It’s also possible that the GPTQ quant is degrading the model’s capability and I’m not noticing it but you are.

    I know it sounds silly, but make sure you actually selected instruct mode in the chat window itself. I didn’t notice those options at first and got weird results with some models because I wasn’t using the right prompt (the default one was being applied, not Alpaca).
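
For the group size and act order question, a quick way to look them up is to read the `quantize_config.json` that AutoGPTQ-style GPTQ downloads usually ship in the model folder. This is a sketch under that assumption: the helper name `describe_gptq_quant` and the example path are hypothetical, and a given download may not include the file or every key.

```python
import json
from pathlib import Path


def describe_gptq_quant(model_dir: str) -> dict:
    """Read quantize_config.json from a local GPTQ model folder and return the key settings.

    ASSUMPTION: the folder follows the AutoGPTQ layout, where "desc_act"
    is the act-order flag. Missing keys come back as None.
    """
    config_path = Path(model_dir) / "quantize_config.json"
    with config_path.open(encoding="utf-8") as f:
        cfg = json.load(f)
    return {
        "bits": cfg.get("bits"),
        "group_size": cfg.get("group_size"),
        "act_order": cfg.get("desc_act"),
    }


if __name__ == "__main__":
    # Hypothetical path; point this at your own model folder.
    print(describe_gptq_quant("models/your-gptq-model-folder"))
```

Posting the three values it prints makes it easier for others to compare against a setup that works.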