Hmm, will have to check this stuff with the people on the RWKV Discord server.
V5 is stable at long context usage, and V6 is aiming to get better at actually using the context, so we might see improvement on this
The dataset is open source; it's all public HF datasets
That's the point of RWKV: you could have a 10 million context length and it would cost the same per token as a 100-token context length
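To make the "10M ctx costs the same as 100 ctx" point concrete, here's a toy sketch (not real RWKV code; layer counts and dimensions are made-up illustrative numbers): a transformer's KV cache grows with every past token, while an RWKV-style recurrent state stays a fixed size no matter how long the history is.

```python
# Toy memory-scaling sketch: transformer KV cache vs RWKV recurrent state.
# n_layers / d_model are illustrative placeholders, not real model configs.

def transformer_cache_entries(ctx_len: int, n_layers: int = 24, d_model: int = 2048) -> int:
    # A transformer stores keys AND values for every past token at every layer.
    return ctx_len * n_layers * 2 * d_model

def rwkv_state_entries(ctx_len: int, n_layers: int = 24, d_model: int = 2048) -> int:
    # An RWKV-style model carries a fixed-size recurrent state;
    # ctx_len never appears in the memory cost.
    return n_layers * d_model

print(transformer_cache_entries(100))         # grows linearly with context
print(transformer_cache_entries(10_000_000))
print(rwkv_state_entries(100))                # identical at 100 tokens...
print(rwkv_state_entries(10_000_000))         # ...and at 10M tokens
```

The per-token compute story is the same shape: recurrent state update is constant-time per token, while full attention has to look back over the whole cache.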
It's trained on 100+ languages; the focus is multilingual
Also, AWQ has entire engines built around it for efficiency; look into the Aphrodite engine, supposedly the fastest for AWQ
“Do I need to learn llama.cpp or C++ to deploy models using the llama-cpp-python library?” No, it's pure Python bindings
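For example, a deployment with llama-cpp-python is just ordinary Python. The model filename and sampling settings below are illustrative placeholders (you'd point it at whatever GGUF you downloaded):

```python
# Sketch: serving a GGUF model with llama-cpp-python -- pure Python,
# no C++ or llama.cpp internals required.

def build_qa_prompt(question: str) -> str:
    # Plain Q/A template; real chat models usually want their own prompt format.
    return f"Q: {question}\nA:"

def run(model_path: str, question: str) -> str:
    from llama_cpp import Llama  # pip install llama-cpp-python
    llm = Llama(model_path=model_path, n_ctx=2048)
    out = llm(build_qa_prompt(question), max_tokens=64, stop=["Q:"])
    return out["choices"][0]["text"]

# Example (needs a local GGUF file; path is hypothetical):
# run("./openhermes-2.5-mistral-7b.Q4_K_M.gguf", "What is RWKV?")
```

The actual model call is commented out since it needs a multi-GB model file on disk, but that's the whole API surface for basic use.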
it outputs the call
OpenHermes 2.5 is amazing from what I've seen: it can call functions, summarize text, and it's extremely competitive, the works
There are plenty of datasets. Just take the ones meant for Stable Diffusion training, rip out the prompt text, profit
Here's some high-quality captions used for DALL-E 3, etc.:
https://huggingface.co/datasets/laion/dalle-3-dataset
https://huggingface.co/datasets/laion/gpt4v-dataset
https://huggingface.co/datasets/laion/wuerstchen-dataset
https://huggingface.co/datasets/laion/220k-GPT4Vision-captions-from-LIVIS
https://huggingface.co/datasets/laion/gpt4v-emotion-dataset
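The "rip out the prompt text" step is basically: keep the caption column, drop the images. Caption column names vary per dataset ("caption", "text", etc.), so check each one; the rows and key names below are fabricated for illustration (in a real pipeline you'd load rows with `datasets.load_dataset`).

```python
# Sketch: extract only the caption text from image-caption dataset rows.

def extract_captions(rows, caption_keys=("caption", "text", "prompt")):
    # Try a few common column names per row; skip rows with no caption.
    captions = []
    for row in rows:
        for key in caption_keys:
            if key in row and row[key]:
                captions.append(row[key])
                break
    return captions

rows = [
    {"image": "<bytes>", "caption": "a watercolor fox in a forest"},
    {"image": "<bytes>", "text": "macro shot of a dew-covered leaf"},
]
print(extract_captions(rows))
```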
RWKV v5 7B. It's only half-trained right now, but the model already surpasses Mistral on all multilingual benchmarks, because it's meant to be multilingual.
OpenHermes 2.5 is the latest version, but the OpenHermes series has a history of being good, and I used it for some function calling; it's really good
IMPORTANT!
this isn't trained from scratch, it's another Mistral fine-tune with DPO, but on SlimOrca, not UltraChat.
I would use OpenHermes; it's much more battle-tested, and it's proven solid
Sad, and we thought HF was the harbor of unaligned models, but maybe I'm missing the whole story. Hopefully they don't kill models for saying "Taiwan good" or something
Open source -> Mistral Instruct worked great for me; Zephyr alpha was crazy aligned, while beta was better
Closed source -> Inflection's Pi is smooth! Pray for API access
“I want to chat with a PDF, I don’t care for my LLM to speak French, be able to write Python or know that Benjamin Franklin wrote a paper on flatulence (all things RWKV v5 World 1.5B knows).”
This is prime RAG: bring snippets in, make the model use them. The more knowledge the model has, the better it gets for your use case as well, since it knows more stuff.
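The "bring snippets in, make the model use them" loop can be sketched in a few lines. This is deliberately naive, keyword-overlap retrieval with a fabricated document list; real setups use embeddings and a vector store, but the prompt-assembly shape is the same.

```python
# Minimal RAG sketch: retrieve the most relevant snippets, stuff them into
# the prompt, and instruct the model to answer only from that context.

def retrieve(question: str, snippets: list[str], k: int = 2) -> list[str]:
    # Naive relevance score: count of shared lowercase words.
    q_words = set(question.lower().split())
    return sorted(
        snippets,
        key=lambda s: len(q_words & set(s.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(question: str, snippets: list[str]) -> str:
    context = "\n".join(f"- {s}" for s in retrieve(question, snippets))
    return (
        "Use only the context below to answer.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

docs = [
    "The PDF covers quarterly revenue figures.",
    "Benjamin Franklin wrote an essay on flatulence.",
    "RWKV v5 World 1.5B is a multilingual model.",
]
print(build_prompt("What did Benjamin Franklin write?", docs))
```

The finished prompt then goes to whatever model you're running; the model's own world knowledge just makes it better at stitching the retrieved snippets together.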
Also, nice using RWKV v5; how's it working for you?
there are GGUFs, check TheBloke or greensky
RWKV 1.5B is SoTA for its size: it outperforms TinyLlama, and uses no extra VRAM to fit its whole context length in the browser.
Noice man
Well the 5 million was just an example of the OP stuff out there
No, it's Victorian-era Frankenstein, obvs