I have tried to set up 3 different versions of it: TheBloke's GPTQ/AWQ versions and the original deepseek-coder-6.7b-instruct.
I have tried the 34B as well.
My specs are 64 GB RAM, RTX 3090 Ti, i7-12700K.
In AWQ I just get a bugged response (a run of " characters repeated until max tokens).
GPTQ works much better, but all versions seem to add an unnecessary * at the end of some lines, and the results are worse than on the website (deepseek.com). Say I ask for a snake game in pygame: it usually gives an unusable version, and after 5-6 tries I'll get a somewhat working version, but I still need to ask for a lot of changes.
On the official website, I get working code on the first try, without any problems.
I am using the Alpaca template, adjusted to match the deepseek format (oobabooga webui).
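For reference, the deepseek-coder instruct format is Alpaca-like ("### Instruction:" / "### Response:"), so the Alpaca template mostly needs the system preamble swapped. A minimal sketch of how I assemble it; the exact preamble wording here is an approximation, not a verified copy of the model card:

```python
# Hedged sketch of a deepseek-coder-style single-turn prompt.
# SYSTEM is an approximation of the model's preamble, not the exact text.
SYSTEM = (
    "You are an AI programming assistant. "
    "Answer questions related to computer science."
)

def build_prompt(instruction: str) -> str:
    """Assemble one turn in the Alpaca-like deepseek-coder instruct style."""
    return f"{SYSTEM}\n### Instruction:\n{instruction}\n### Response:\n"

print(build_prompt("Write a snake game in pygame."))
```

If the template in the webui doesn't match this structure closely, output quality drops a lot, which could explain part of the gap versus the website.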
What could cause this? Is the website version different from the Hugging Face model?
lol, I will stop wasting my time now - I spent roughly 3 hours today trying to get it to work :D Mostly around GGUF
With oobabooga you have to modify requirements.txt to get the latest llama-cpp-python:
Do a git pull, then inside requirements.txt replace
llama_cpp_python-0.2.11 with llama_cpp_python-0.2.18
Then, still in your env:
pip install -r requirements.txt --upgrade
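The steps above can be sketched as a few shell commands. This assumes the standard text-generation-webui checkout layout and that the wheel pin appears literally as llama_cpp_python-0.2.11 in requirements.txt (as it did for me); adjust paths and version strings to your install:

```shell
# Sketch of the upgrade, assuming the usual text-generation-webui layout.
cd text-generation-webui
git pull
# Bump the pinned llama-cpp-python wheel from 0.2.11 to 0.2.18 in place.
sed -i 's/llama_cpp_python-0\.2\.11/llama_cpp_python-0.2.18/g' requirements.txt
# Re-install inside the webui's environment (activate your conda env/venv first).
pip install -r requirements.txt --upgrade
```

The sed step just automates the manual edit; editing the file by hand works the same.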
PS: even 0.2.14 gave me bad answers (it would start to answer, then fill the result with 3333333…).
0.2.18 fixes the issue.