Fun_Tangerine_1086@alien.topB to

LocalLLaMA · English · 2 years ago

Why is Mistral-7b so capable? Any ideas re: dataset?

So Mistral-7b is a pretty impressive 7B-parameter model, but why is it so capable? Do we have any insight into its dataset? Was it trained far beyond the compute-optimal token count that scaling laws would suggest for its size? Have there been any attempts at open reproductions, or at merges that scale up the number of parameters?
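
To make "beyond the scaling limit" concrete, here is a rough back-of-the-envelope sketch. It assumes the commonly quoted Chinchilla rule of thumb of about 20 training tokens per parameter, which is an approximation and not anything Mistral has published. The function names are just for illustration, and the only real training figure used is Llama 2's reported 2T tokens, since Mistral's own token count isn't public.

```python
# Rule-of-thumb sketch only. TOKENS_PER_PARAM = 20 is the commonly quoted
# Chinchilla approximation (an assumption here), not a figure from Mistral.
TOKENS_PER_PARAM = 20


def compute_optimal_tokens(n_params: int) -> int:
    """Approximate compute-optimal training tokens for a model with n_params."""
    return TOKENS_PER_PARAM * n_params


def overtraining_ratio(n_params: int, n_tokens: int) -> float:
    """How many times past the rule-of-thumb token budget a training run went."""
    return n_tokens / compute_optimal_tokens(n_params)


if __name__ == "__main__":
    n_params = 7_000_000_000  # nominal "7B"; the exact count differs per model
    print(f"rule-of-thumb budget: {compute_optimal_tokens(n_params):,} tokens")

    # Llama 2 7B was reported as trained on 2T tokens; used only as a reference.
    llama2_tokens = 2_000_000_000_000
    print(f"Llama 2 7B ratio: {overtraining_ratio(n_params, llama2_tokens):.1f}x")
```

Under that assumption a 7B model's budget works out to roughly 140B tokens, so anything trained on trillions of tokens is already far past it. Where Mistral-7b actually lands is unknown without its token count.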

Charuru@alien.topB
2 years ago
The results are okay, but I’m hard-pressed to call it “very capable”. My view is that other, bigger models are making mistakes they shouldn’t be making because they were “trained wrong”.