SQLCoder-34b beats GPT-4 at Text-to-SQL

tail-recursion@alien.top · 2 years ago

SomeOddCodeGuy@alien.top · 2 years ago

Is Deepseek-coder-33b-instruct the current open-source coding SOTA?

I’m interested to know how people are using it, because someone mentioned that it was a Llama 1 model with a 2048-token context, and for development I generally need a lot more than that. I’m not sure how I’d be able to make use of a coding model with only 2048 tokens of context. :(

kryptkpr@alien.top · 2 years ago

DeepSeek is not based on any Llama training; it’s their own 2T-token pretrain, with a 16k context. All this info is at the top of their model card.

SomeOddCodeGuy@alien.top · 2 years ago

Ahhh I apologize, I didn’t realize that it’s a totally new base model, and not built off of any Llama base. I could have sworn I read that it was, but it would appear I was incorrect.

Thanks!

Results on novel datasets not seen in training