https://huggingface.co/NurtureAI/Starling-LM-11B-alpha-v1
This is Berkeley’s Starling-LM-7B-alpha with the model size increased from 7B to 11B.
Special thanks to user Undi95 for their explanation of Mistral passthrough merging with cg123’s mergekit, to Berkeley for Starling-LM-7B-alpha, and to everyone contributing to open-source AI development.
Together we are strong!
The performance of this model should improve considerably once it is further fine-tuned with the newly added layers.
AWQ and GGUF versions coming soon!
I think I found the key to avoiding the repetitions and long, rambling answers this model has a tendency to produce; hopefully further fine-tuning will reduce them. The key is to turn creativity all the way down and make the model deterministic. How do you do that, you may ask? Easy: it is controlled by three inference parameters: temp, top_p, and top_k.
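To make the effect of these three parameters concrete, here is a minimal, self-contained sketch of how a sampler typically applies them (illustrative pure-Python code, not this model's actual inference stack): low temp sharpens the distribution, top_k keeps only the k most likely tokens, and top_p keeps the smallest set of tokens whose cumulative probability reaches p. With top_k=1 (or temp near zero), sampling collapses to always picking the most likely token, i.e. deterministic output.

```python
import math
import random

def sample_next_token(logits, temp=0.7, top_k=40, top_p=0.95, rng=None):
    """Pick a token index from raw logits using temperature, top-k and top-p.

    Illustrative sketch of standard sampling; parameter names mirror the
    usual inference settings (temp, top_k, top_p).
    """
    rng = rng or random.Random()
    # Temperature: divide logits before softmax; as temp -> 0 this
    # approaches a pure argmax (deterministic) choice.
    scaled = [l / max(temp, 1e-6) for l in logits]
    # Softmax (shifted by the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    # Top-k: keep only the k most probable tokens.
    probs.sort(key=lambda pair: pair[1], reverse=True)
    probs = probs[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalize over the surviving tokens and draw one.
    z = sum(p for _, p in kept)
    r = rng.random() * z
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

With temp=0.1, top_k=1, top_p=0.1, the same input logits always yield the same token; with the defaults, the choice is random across the surviving candidates.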
With the following default settings I often get repetitions or additional rambling information:
If I use the following values instead, to make the model deterministic, the problem seems to be gone:
Please note that if you want to use the model for story writing, you may get better results by dialing the creativity back up.
Here is my complete config file for LM Studio:
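(For readers without the original preset file: purely as an illustration, and not the author's actual config or a guaranteed-exact LM Studio schema, a deterministic-leaning set of inference parameters might look like the following JSON fragment.)

```json
{
  "inference_params": {
    "temp": 0.1,
    "top_k": 1,
    "top_p": 0.1
  }
}
```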
A few words about the above:
As a bonus, here is my config for generating code, which, according to my limited testing, this model seems to be surprisingly good at:
I have been further testing code generation, and I am impressed! It seems to be almost on par with GPT-4, and it can do things GPT-4 cannot, like writing code for Google Apps Script. I have tried a few relatively complex tasks in various languages (PowerShell, Python, JavaScript, Google Apps Script), covering various domains (Active Directory, Spotify, mathematics). As far as I can tell, the code provided is correct!
Try the following example (make sure to load the OpenChat code prompt format I provided earlier); the result is impressive:
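For reference, OpenChat-3.5 models (on which Starling-LM-7B-alpha is based) use a code-mode prompt of the form `Code User: ... <|end_of_turn|>Code Assistant:`. Assuming that is the format referred to above, a prompt can be assembled like this (a sketch; the helper name is my own, not part of any library):

```python
END_OF_TURN = "<|end_of_turn|>"

def build_code_prompt(user_message: str) -> str:
    """Assemble an OpenChat-style code-mode prompt (assumed format)."""
    return f"Code User: {user_message}{END_OF_TURN}Code Assistant:"

prompt = build_code_prompt("Write a Python function that reverses a string.")
```

The model's completion is then generated as a continuation after the final `Code Assistant:` marker.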