kryptkpr@alien.top to LocalLLaMA • SQLCoder-34b beats GPT-4 at Text-to-SQL • 1 year ago
DeepSeek is not based on any llama training; it's a 2T-token pretrain of their own, with 16k context. All this info is at the top of their model card.
kryptkpr@alien.top to LocalLLaMA • 1 year ago
GoLLIE: Guideline-following Large Language Model for Information Extraction (hitz-zentroa.github.io)