The new chat model released by Intel is now at the top of the Open LLM Leaderboard (among the 7B models).
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
Are they releasing the weights?
Thanks. Any idea what the difference is between Intel/neural-chat-7b-v3-1 and Intel/neural-chat-7b-v3? They have slightly different scores but so far I can’t see
I have no idea. A guy asked the same question 10 days ago and didn’t get a reply.
Intel/neural-chat-7b-v3-1 · What is the different between Intel/neural-chat-7b-v3-1 vs Intel/neural-chat-7b-v3? (huggingface.co)
Lol ty for at least replying to this
The CausalLM (14B, llamafied) model is experimenting with something similar. These are the stories they can create. In my roleplay session, neural-chat put me in a community of gypsy people and described their culture and customs the way they are in real life. This is an impressive model from Intel.
The model seems cool and all, but the paper is better.
Intel eliminated the preference data from direct preference optimization. Preference data is expensive and collecting it is a hassle, so this is a big deal. Best of all, it looks like their no-preference DPO actually performs better.
The trick is sampling rejects from a small model. Let’s say you have a dataset of GPT-4 completions. You mark those as good (“preferred”). You prompt Llama 2 13B and mark its responses as rejects.
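To make the trick above concrete, here is a rough sketch in plain Python, not Intel's actual pipeline. It pairs each strong-model completion ("chosen") with a weak-model completion ("rejected"), then evaluates the standard DPO loss on one pair. The names `build_preference_pairs`, `weak_generate` and `dpo_loss` are made up for illustration, the `prompt`/`chosen`/`rejected` keys are just a common convention for preference datasets, and `beta=0.1` is an arbitrary choice.

```python
import math
from typing import Callable, Dict, List


def build_preference_pairs(
    prompts: List[str],
    strong_completions: List[str],
    weak_generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Pair each strong-model completion (chosen) with a weak-model one (rejected).

    `weak_generate` is a hypothetical callable standing in for whatever
    smaller model you sample rejects from.
    """
    if len(prompts) != len(strong_completions):
        raise ValueError("prompts and strong_completions must have the same length")
    pairs = []
    for prompt, chosen in zip(prompts, strong_completions):
        rejected = weak_generate(prompt)
        # Skip degenerate pairs where both models produced the same text.
        if rejected.strip() == chosen.strip():
            continue
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Toy usage with a stub in place of the small model (assumption: no real model here).
pairs = build_preference_pairs(
    prompts=["What is 2 + 2?"],
    strong_completions=["2 + 2 = 4."],
    weak_generate=lambda p: "I think it might be 5.",
)
```

No human labeling happens anywhere in this: the "preference" is simply assumed from which model wrote the answer.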
TL;DR: This could boost the performance of nearly every model with a minimal increase in complexity (though obviously it’s non-zero compute).
Thank you for the summary, that is actually a very cool idea!
I found it to be worse than OpenHermes 2.5. It just gives shorter, more robotic responses.
Same, I found it tends to give short responses.
But are the short responses more correct?
Exactly. It didn’t hallucinate even once in my tests. I used RAG and it gave me perfect, to-the-point answers. I know most people want more verbose outputs, but it’s good for factual retrieval use cases.
Maybe for RAG, short answers are less prone to hallucination? I will test more. Thanks.
This is a fine-tuned/instruction-tuned model. Explicit system prompts or instructions like “generate a long, detailed answer” can make the model generate longer responses. 🙂
–Kaokao, AI SW Engineer @ Intel
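For anyone who wants to try the system-prompt suggestion, a minimal sketch of building such a prompt is below. The `### System:` / `### User:` / `### Assistant:` layout is an assumption based on my reading of the neural-chat model card, so check the card before relying on it. `build_prompt` and the default system message are made up for illustration.

```python
def build_prompt(
    user_message: str,
    system_message: str = "Generate a long, detailed answer.",
) -> str:
    """Assemble a single-turn prompt with an explicit system instruction.

    The section markers below are an assumed template, not a confirmed API.
    """
    return (
        f"### System:\n{system_message}\n"
        f"### User:\n{user_message}\n"
        f"### Assistant:\n"
    )


prompt = build_prompt("Explain what RAG is and when to use it.")
```

The resulting string would be passed to whatever generation call you already use. Raising the maximum number of new tokens there matters too, since a low cap cuts off long answers regardless of the instruction.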