vatsadev@alien.top (OP) to LocalLLaMA • RWKV v5 7b, Fully Open-Source, 60% trained, approaching Mistral 7b in abilities or surpassing it • English • 1 • 2 years ago
Hmm, I'll have to check this with the people on the RWKV Discord server.
V5 is stable in its context usage, and V6 is trying to get better at actually using the context, so we might see improvement here.
vatsadev@alien.top (OP) to LocalLLaMA • RWKV v5 7b, Fully Open-Source, 60% trained, approaching Mistral 7b in abilities or surpassing it • English • 1 • 2 years ago
Um, the dataset is open-source; it's all public HF datasets.
vatsadev@alien.top (OP) to LocalLLaMA • RWKV v5 7b, Fully Open-Source, 60% trained, approaching Mistral 7b in abilities or surpassing it • English • 1 • 2 years ago
That's the point of RWKV: you could have a 10-million-token context length and it would cost the same as a 100-token context.
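That constant cost comes from RWKV being recurrent: a fixed-size state is updated once per token, so memory doesn't grow with context length the way a transformer's KV cache does. A toy sketch of the idea (not the real RWKV kernels; the state size and update rule here are purely illustrative):

```python
# Toy recurrent state update: memory stays O(state_dim) no matter how
# many tokens are processed, unlike a transformer KV cache, which
# grows linearly with context length.
def run_recurrent(tokens, state_dim=4):
    state = [0.0] * state_dim  # fixed-size state, never grows
    for tok in tokens:
        # Illustrative update rule: decay the state and mix in the token.
        state = [0.9 * s + 0.1 * tok for s in state]
    return state

short = run_recurrent([1.0] * 100)     # "100-token context"
long = run_recurrent([1.0] * 10_000)   # "10,000-token context"
assert len(short) == len(long) == 4    # same state size either way
```

The real architecture's state update is far more sophisticated, but the memory story is the same: processing 10M tokens touches the same fixed-size state as processing 100.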
vatsadev@alien.top (OP) to LocalLLaMA • RWKV v5 7b, Fully Open-Source, 60% trained, approaching Mistral 7b in abilities or surpassing it • English • 1 • 2 years ago
It's trained on 100+ languages; the focus is multilingual.
vatsadev@alien.top to LocalLLaMA • ctransformers VS llama-cpp-python which one should I use? • English • 1 • 2 years ago
Also, AWQ has entire engines built for efficiency; look into the Aphrodite engine, supposedly the fastest for AWQ.
vatsadev@alien.top to LocalLLaMA • ctransformers VS llama-cpp-python which one should I use? • English • 1 • 2 years ago
“Do I need to learn llama.cpp or C++ to deploy models using the llama-cpp-python library?” No, it's pure Python.
It outputs the call.
OpenHermes 2.5 is amazing from what I've seen. It can call functions, summarize text, and is extremely competitive, the works.
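"It outputs the call" in practice means the model emits a structured blob (usually JSON) that your own code parses and dispatches; the model never executes anything itself. A minimal sketch, where the tool name and the exact output format are assumptions for illustration, not a fixed OpenHermes schema:

```python
import json

# Hypothetical tools the model is allowed to call.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(model_output: str) -> str:
    """Parse a JSON function call emitted by the model and run it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]          # look up the requested tool
    return fn(**call["arguments"])    # invoke with the model's arguments

# Example model output (format is an assumption for illustration):
print(dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}'))
# → Sunny in Paris
```

In a real setup you'd also validate the tool name and arguments before dispatching, since the model can emit malformed or unexpected calls.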
vatsadev@alien.top to LocalLLaMA • Is there a fine tune or dataset that focuses on creating prompts that are used in image generation like stable diffusion? • English • 1 • 2 years ago
There are plenty of datasets. Just take the ones meant for Stable Diffusion training, rip out the prompt text, profit.
Here are some high-quality captions used for DALL-E 3, GPT-4V, etc.:
- https://huggingface.co/datasets/laion/dalle-3-dataset
- https://huggingface.co/datasets/laion/gpt4v-dataset
- https://huggingface.co/datasets/laion/wuerstchen-dataset
- https://huggingface.co/datasets/laion/220k-GPT4Vision-captions-from-LIVIS
- https://huggingface.co/datasets/laion/gpt4v-emotion-dataset
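"Ripping out the prompt text" usually amounts to pulling one column out of the rows after loading the dataset (e.g. with `datasets.load_dataset`). A sketch over plain dict rows; the column name `caption` is an assumption, so check each dataset's actual schema:

```python
def extract_prompts(rows, caption_key="caption"):
    """Pull the caption/prompt text out of dataset rows, skipping blanks."""
    return [r[caption_key].strip()
            for r in rows
            if r.get(caption_key, "").strip()]

# Rows shaped like typical caption-dataset entries (schema assumed).
rows = [
    {"caption": "a watercolor fox in a forest", "image": "..."},
    {"caption": "", "image": "..."},  # blank captions get dropped
]
print(extract_prompts(rows))  # → ['a watercolor fox in a forest']
```

The resulting list of strings is the prompt corpus you'd fine-tune on.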
RWKV v5 7B. It's only half trained right now, but the model already surpasses Mistral on all multilingual benchmarks, because it's meant to be multilingual.
OpenHermes 2.5 is the latest version, but the OpenHermes series has a history of being good among AI models, and I used it for some function calling; it's really good.
IMPORTANT!
This isn't trained from scratch; it's another Mistral fine-tune with DPO, but using SlimOrca instead of UltraChat.
I would use OpenHermes; it's much more trialed and proven solid.
vatsadev@alien.top to LocalLLaMA • Hugging Face Removes Singing AI Models of Xi Jinping But Not of Biden • English • 1 • 2 years ago
Sad, and we thought HF was the harbor of unaligned models, but maybe I'm missing the whole story. Hopefully they don't kill models for saying "Taiwan good" or something.
Open source -> Mistral Instruct worked great for me. Zephyr alpha was crazy aligned, while beta was better.
Closed source -> Inflection's Pi is smooth! Pray for API access.
vatsadev@alien.top to LocalLLaMA • Is there any work being done on LLMs trained on a subset of knowledge? • English • 1 • 2 years ago
“I want to chat with a PDF. I don't care for my LLM to speak French, be able to write Python, or know that Benjamin Franklin wrote a paper on flatulence (all things RWKV v5 World 1.5B knows).”
This is prime RAG: bring snippets in and make the model use them. The more knowledge the model has, the better it gets for your use case as well, since it knows more stuff.
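The "bring snippets in, make the model use them" loop can be sketched with naive keyword-overlap retrieval; a real setup would use embeddings, and the scoring and prompt template here are purely illustrative:

```python
def retrieve(query, snippets, k=2):
    """Rank snippets by word overlap with the query (toy scoring)."""
    q = set(query.lower().split())
    scored = sorted(snippets,
                    key=lambda s: len(q & set(s.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, snippets):
    """Stuff the retrieved snippets into the prompt so the model uses them."""
    context = "\n".join(f"- {s}" for s in retrieve(query, snippets))
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Benjamin Franklin wrote a satirical essay on flatulence.",
    "Paris is the capital of France.",
    "RWKV is a recurrent language model architecture.",
]
print(build_prompt("What did Benjamin Franklin write?", docs))
```

The model then only needs to read and synthesize the retrieved snippets, which is why broad general ability matters more for this than having the PDF's facts baked into the weights.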
Also, nice that you're using RWKV v5; how is it working for you?
vatsadev@alien.top to LocalLLaMA • TinyLlama Base Model Trained on 2T Tokens Complete • English • 1 • 2 years ago
There are GGUFs; check TheBloke or greensky.
RWKV 1.5B: it's SOTA for its size, outperforms TinyLlama, and uses no extra VRAM to fit its whole context length in the browser.
vatsadev@alien.top (OP) to LocalLLaMA • Why not test all models for training on the test data with Min-K% Prob? • 1 • 2 years ago
Nice, man.
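For reference, the Min-K% Prob check in the post title scores a text by averaging the log-probabilities of its k% least-likely tokens: text the model saw in training tends not to contain very surprising tokens, so higher scores suggest possible contamination. A sketch over precomputed per-token log-probs (the choice of k and any decision threshold are up to the user):

```python
def min_k_percent_prob(token_logprobs, k=20):
    """Average log-prob of the k% lowest-probability tokens.

    Higher (closer to zero) scores suggest the text may have been
    seen during training; very negative scores suggest it wasn't.
    """
    n = max(1, int(len(token_logprobs) * k / 100))
    lowest = sorted(token_logprobs)[:n]  # the most "surprising" tokens
    return sum(lowest) / n

# Memorized-looking text: no surprising tokens.
seen = [-0.1, -0.2, -0.1, -0.3, -0.2]
# Unseen text: a few very unlikely tokens drag the score down.
unseen = [-0.1, -4.0, -0.2, -5.5, -0.3]
assert min_k_percent_prob(seen) > min_k_percent_prob(unseen)
```

Getting the per-token log-probs requires a forward pass over the candidate test set, which is why running this across every released model is the expensive part.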
Well, the 5 million was just an example of the OP stuff out there.
No, it's Victorian-era Frankenstein, obviously.