NeuralChat 7B: Intel’s Chat Model Trained with DPO

aminedjeghri@alien.top · 3 years ago

NeuralChat 7B: Intel’s Chat Model Trained with DPO

No_waln@alien.top · 3 years ago

Are they releasing the weights ?

aminedjeghri@alien.top · 3 years ago

yes
Intel/neural-chat-7b-v3-1 · Hugging Face

timtulloch11@alien.top · 3 years ago

Thanks Any idea what the difference is between Intel/neural-chat-7b-v3-1 and Intel/neural-chat-7b-v3? They have slightly different scores but so far I can’t see

aminedjeghri@alien.top · 3 years ago

i have no idea. A guy asked the same question 10 days ago and didn’t get a reply
Intel/neural-chat-7b-v3-1 · What is the different between Intel/neural-chat-7b-v3-1 vs Intel/neural-chat-7b-v3? (huggingface.co)

timtulloch11@alien.top · 3 years ago

Lol ty for at least replying this

justynasty@alien.top · 3 years ago

CausalLM (14B, llamafied) model is experimenting with something similar. These are the stories they can create. In my roleplay session, neural chat put me in the community of gypsy people, and it described their culture and customs like it happens in real life, this is an impressive model from Intel.

georgejrjrjr@alien.top · 3 years ago

The model seems cool and all, but the paper is better.

Intel eliminated the preference data from direct preference optimization. Preference data is expensive and collecting it is a hassle, so this is a big deal. Best of all, it looks like their no-preference DPO actually performs better.

The trick is sampling rejects from a small model. Let’s say you have a dataset of GPT-4 completions. You mark those as good (“preferred”). You prompt Llama 2 13B and mark its responses as rejects.

Tl;dr This could boost the performance of nearly every model with a minimal increase in complexity (though obviously it’s non-zero compute).

cztomsik@alien.top · 3 years ago

Thank you for the summary, that is actually very cool idea!

durden111111@alien.top · 3 years ago

I found it to be worse than openhermes 2.5. It just gives shorter, more robotic responses

julylu@alien.top · 3 years ago

same, i found it tends to give short response.

yahma@alien.top · 3 years ago

But are the short responses more correct?

Shoddy_Vegetable_115@alien.top · 3 years ago

Exactly. It didn’t hallucinate even once in my tests. I used RAG and it gave me perfect to-the-point answers. But I know most people want more verbose outputs it’s just that it’s good for factual retrieval use cases.

julylu@alien.top · 3 years ago

Maybe for RAG, short answer is less possible for hallucination？I will test more. thanks

Intel@alien.top · 3 years ago

This is a fine-tuned/instruction-tuned model. Explicit system prompts or instructions like “generate a long, detailed answer” can make the model generate longer responses. 🙂

–Kaokao, AI SW Engineer @ Intel