So RWKV 7b v5 is 60% trained now, saw that multilingual parts are better than mistral now, and the english capabilities are close to mistral, except for hellaswag and arc, where its a little behind. all the benchmarks are on rwkv discor, and you can google the pro/cons of rwkv, though most of them are v4.
Thoughts?
I tested the 3B model and it looks good, especially the multilingual part (demo https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2)
Well it seems a lot better at Slovenian than LLamas or Mistral, especially for a 3B model, although it mostly just rambles about stuff that’s vaguely related to the prompt and makes lots of grammatical mistakes. The 7B one ought to be interesting once it’s done.
Its trained on 100+ languages, the focus is multilingual
Will that make it a good translator? I remember seeing somewhere a 400+ language translation model but not an LLM somewhere. Wonder what the best many language open source fast high quality translation solutions might look like.
I tested it. It understands Persian, but not so well, it also hallucinates people.
and Mistral doesn’t?
keep in mind that the demo is for 3B model, and the post is about 7B, which I expect to be way better
Seems amazingly good. I might get a real use out of a raspberry pi after all.