I am using kobold.cpp and it couldn’t code anything outside of hello world. Am I doing something wrong?
Looks like a very small model. Maybe better suited for a code-completion use case.
Try setting the temperature to 0.1.
I've had really good luck with this model at 6.7B and 33B. The 1.3B is more of a novelty because of how fast it runs on ancient GPUs; it's not nearly as good as the other two sizes in my attempts, though it is amazing for its size.
It is only 1.3B :-) I have noticed that smaller models work a lot better with longer, more detailed prompts (at least 440 characters; twice that is even better).
Is that a base model or an instruct-tuned fine-tune? If it's the base model, that behavior wouldn't be too far out of the ordinary; base models tend to go off the rails. You can try setting the repetition penalty to 1, which might help a touch.
Also, set the temperature to 0.1 or 0.2. Those two things helped me get it working nicely.
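If you'd rather set those samplers programmatically than through the UI, here's a minimal sketch using KoboldCpp's local HTTP API. It assumes the default port (5001) and a locally running server; the field names (`temperature`, `rep_pen`, `max_length`) follow the KoboldAI-style generate endpoint, but double-check against your KoboldCpp version.

```python
# Sketch: low-temperature, no-repetition-penalty settings for a small coder model,
# sent to KoboldCpp's local generate endpoint. Port/URL are assumptions.
import json
import urllib.request

payload = {
    "prompt": "Write a Python function that reverses a string.",
    "temperature": 0.1,  # low temperature keeps a small model focused
    "rep_pen": 1.0,      # repetition penalty of 1.0 = effectively disabled
    "max_length": 256,
}

req = urllib.request.Request(
    "http://localhost:5001/api/v1/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# response = urllib.request.urlopen(req)  # uncomment with a running server
```

The request is only constructed here, not sent, so you can adapt the payload before wiring it up.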
2 ideas:
- use deepseek-coder-1.3b-instruct, not the base model
- check that you're using the correct prompting template for the model
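For reference, the deepseek-coder instruct models use `### Instruction:` / `### Response:` markers after a short system line. A sketch of a helper that wraps a question in that format (the system-line wording here is paraphrased; check the model card for the exact text):

```python
# Sketch of the deepseek-coder instruct prompt format.
# The "### Instruction:" / "### Response:" markers match the model card;
# the opening system line is a paraphrase, not the exact published wording.
def build_prompt(question: str) -> str:
    return (
        "You are an AI programming assistant.\n"
        "### Instruction:\n"
        f"{question}\n"
        "### Response:\n"
    )

print(build_prompt("Write a function that reverses a string."))
```

Getting these markers wrong is a common reason a small instruct model rambles instead of answering.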
It is the instruct model. You can see underneath the prompt box that it's the deepseek-coder-1.3b-instruct_Q5_K_s model. I used the prompting template that comes with the model, and it slightly improved the answers.
But if I ask it to write some code, it almost never does, and just outputs gibberish.
Does GPU/CPU quality affect the AI's output? My device is a potato.