New Model: Starling-LM-11B-alpha-v1

perlthoughts@alien.top · 2 years ago

New Model: Starling-LM-11B-alpha-v1

ex-arman68@alien.top · 2 years ago

I think I found the key to avoiding the repetitions and long, rambling answers this model tends to produce. Hopefully a further fine-tune will reduce them. The key is to turn creativity all the way down and make the model deterministic. How do you do that, you may ask? Easy: it is controlled by the following 3 inference parameters: temp, top_p, and top_k.

With the following default settings I often get repetitions or additional rambling information:

    "top_k": 40,
    "top_p": 0.95,
    "temp": 0.8,

If I use the following values instead, to make the model deterministic, the problem seems to be gone:

    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,

Please note that if you want to use the model for story writing, you may get better results by dialing the creativity back up.
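To see why these three values make the output deterministic, here is a minimal sketch of a sampler in plain Python. It is not LM Studio's or llama.cpp's actual code: the function name `sample_next` is made up for illustration, and the order in which real backends apply the filters may differ. The point is only that with `top_k` set to 1, a single candidate survives, so the most likely token is always chosen.

```python
import math
import random

def sample_next(logits, top_k=40, top_p=0.95, temp=0.8, rng=random):
    """Pick a token index from raw logits (hypothetical helper, for illustration)."""
    # Temperature: dividing by a small temp sharpens the distribution.
    scaled = [l / temp for l in logits]
    # Softmax, shifted by the maximum for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)
    # top_k: keep only the k most likely tokens.
    probs = probs[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break
    # Sample among the survivors, weighted by their probabilities.
    indices = [i for _, i in kept]
    weights = [p for p, _ in kept]
    return rng.choices(indices, weights=weights, k=1)[0]

logits = [1.0, 3.0, 2.0]
# With top_k=1 only the argmax (index 1) can ever be returned.
print({sample_next(logits, top_k=1, top_p=0.1, temp=0.1) for _ in range(100)})
```

With the default values, several tokens usually survive the filters, so repeated calls can return different indices.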

Here is my complete config file for LM Studio:

{
  "name": "OpenChat",
  "inference_params": {
    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,
    "input_prefix": "GPT4 Correct User: ",
    "input_suffix": "<|end_of_turn|>GPT4 Correct Assistant: ",
    "antiprompt": [
      "GPT4",
      "<|end_of_turn|>",
      "[End of Turn]",
      "[]"
    ],
    "pre_prompt": "Below is an instruction that describes a task. Write a concise response that appropriately completes the request. Ensure all essential details are provided. Each of your statements must be unique.",
    "pre_prompt_suffix": "<|end_of_turn|>",
    "pre_prompt_prefix": "GPT4 System: "
  }
}

A few words about the above:

I only include the necessary options, to avoid overwriting user settings when loading the model or switching prompt format. If you export a config file, please make sure you then edit it manually to clean it up.
GPT4 Correct User/Assistant: the Correct keyword is important. It refers to the training data, where the answers were verified as correct. If you do not use it (e.g. GPT4 User), it will still work, but it will give more weight to training data that was unverified (Human User was also used).
GPT4 System or just System are the 2 officially recommended ways to prefix system messages. Either works.
In my system message (pre_prompt), I avoid any negatives (e.g. No repetitions). Remember this is just a language model: if it sees the word “repeat” (or similar), it will have a tendency to see this as an instruction to create repetitions! Instead I turned it around into a positive statement based on the word “unique”.
Trailing spaces in the prefixes and suffixes are not critical, but they ensure proper formatting.
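For anyone calling the model outside LM Studio, the prefixes and suffixes above can be assembled by hand. This is a rough sketch that assumes the pieces are simply concatenated in the order system prefix, system message, system suffix, user prefix, user message, user suffix. The helper name `build_prompt` is invented for this example; check what your frontend actually sends before relying on it.

```python
def build_prompt(user_message, system_message=None,
                 pre_prompt_prefix="GPT4 System: ",
                 pre_prompt_suffix="<|end_of_turn|>",
                 input_prefix="GPT4 Correct User: ",
                 input_suffix="<|end_of_turn|>GPT4 Correct Assistant: "):
    """Assemble a single-turn prompt string (assumed concatenation order)."""
    parts = []
    if system_message:
        # Optional system block, closed by the end-of-turn marker.
        parts.append(pre_prompt_prefix + system_message + pre_prompt_suffix)
    # The user turn, followed by the opening of the assistant turn.
    parts.append(input_prefix + user_message + input_suffix)
    return "".join(parts)

print(build_prompt(
    "What is a passthrough merge?",
    system_message="Write a concise response. Each of your statements must be unique.",
))
```

Swapping the defaults for "Code User: " and "<|end_of_turn|>Code Assistant: " gives the coding variant of the same format.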

As a bonus, here is my config for generating code, which according to my limited testing, this model seems to be surprisingly good at:

{
  "name": "OpenChat Code",
  "inference_params": {
    "top_k": 1,
    "top_p": 0.1,
    "temp": 0.1,
    "input_prefix": "Code User: ",
    "input_suffix": "<|end_of_turn|>Code Assistant: ",
    "antiprompt": [
      "GPT4",
      "<|end_of_turn|>",
      "[End of Turn]",
      "[]"
    ],
    "pre_prompt": "You are a helpful coding assistant. Respond concisely, but ensure all essential details are provided. Each of your statements must be unique.",
    "pre_prompt_suffix": "<|end_of_turn|>",
    "pre_prompt_prefix": "GPT4 System: "
  }
}

ex-arman68@alien.top · 2 years ago

I have been further testing code generation, and I am impressed! It seems to be almost on par with GPT4, and it can do things GPT4 cannot, like writing code for Google Apps Script. I have tried a few relatively complex tasks in various languages (PowerShell, Python, JavaScript, Google Apps Script), covering various domains (Active Directory, Spotify, mathematics). As far as I can tell, the code provided is correct!

Try the following example (make sure to load the OpenChat code prompt format I provided earlier); the result is impressive:

I need help writing code for google apps script. Could you please write a function that connects to spotify and sort a given playlist by order of popularity. Then another function to write back the sorted track to a given playlist. Make sure to handle pagination properly, in case the spotify playlist is large. Also added errors and exception handling. Include details of how to connect to spotify.

perlthoughts@alien.top · 2 years ago

not sure about that, but it is good.

extopico@alien.top · 2 years ago

How was the model size increased to 11B? It’s a merge, but with what?

perlthoughts@alien.top · 2 years ago

    slices:
      - sources:
          - model: berkeley-nest/Starling-LM-7B-alpha
            layer_range: [0, 24]
      - sources:
          - model: berkeley-nest/Starling-LM-7B-alpha
            layer_range: [8, 32]
    merge_method: passthrough
    dtype: float16
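A quick back-of-the-envelope check of what this config produces, as a sketch only. It assumes each `layer_range` is start-inclusive and end-exclusive, and that the base model has Mistral-7B-style dimensions (hidden size 4096, MLP size 14336, 32 attention heads, 8 key/value heads, vocabulary of about 32000). Those dimensions are assumptions, not something stated in this thread, and both function names are invented for the example.

```python
def expand_slices(slices):
    """Return the source-layer index used at each position of the merged stack."""
    layers = []
    for start, end in slices:
        layers.extend(range(start, end))  # assumes the end index is exclusive
    return layers

def estimate_params(n_layers, hidden=4096, intermediate=14336,
                    n_heads=32, n_kv_heads=8, vocab=32000):
    """Rough parameter count for an assumed Mistral-style decoder."""
    head_dim = hidden // n_heads
    kv_dim = n_kv_heads * head_dim
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q and o, plus k and v
    mlp = 3 * hidden * intermediate                        # gate, up and down projections
    norms = 2 * hidden                                     # two norm vectors per layer
    per_layer = attention + mlp + norms
    embeddings = 2 * vocab * hidden                        # input embedding and output head
    return n_layers * per_layer + embeddings + hidden      # plus the final norm

merged = expand_slices([(0, 24), (8, 32)])
duplicated = sorted({i for i in merged if merged.count(i) > 1})
print(len(merged), "layers in the merged model")
print("layers used twice:", duplicated[0], "to", duplicated[-1])
print(f"{estimate_params(32) / 1e9:.2f}B -> {estimate_params(len(merged)) / 1e9:.2f}B parameters")
```

Under those assumptions the two slices stack into 48 layers, with source layers 8 to 23 appearing twice, which takes the estimate from roughly 7.2B to roughly 10.7B parameters. That would explain the "11B" in the name.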

roselan@alien.top · 2 years ago

I got starling-11b-q8_0.gguf in LM Studio and can’t get a decent output. Either I get a one-liner like “here is a possible response: hello” with the default or chatml preset, or I get pages of various “possible” responses, filled with smileys most of the time (alpaca preset).

What chat format can I use?

LyPreto@alien.top · 2 years ago

I saw their 7B model closing in on gpt-4 scores in some benchmarks which is absolutely wild but also sus

perlthoughts@alien.top · 2 years ago

zuck?

shaman-warrior@alien.top · 2 years ago

It’s surprising… check it out at least

perlthoughts@alien.top · 2 years ago

thanks.

perlthoughts@alien.top · 2 years ago

more bots, more bots!

perlthoughts@alien.top · 2 years ago

with itself.

extopico@alien.top · 2 years ago

you merged Starling with Starling? What merge did you use? Can you share the yaml?

perlthoughts@alien.top · 2 years ago

yes, it’s merged the same way as mistral 11b. With itself.

CheatCodesOfLife@alien.top · 2 years ago

lmao

perlthoughts@alien.top · 2 years ago

😀

perlthoughts@alien.top · 2 years ago

you can now download it with lmstudio. be sure to use the openchat prompt template.

perlthoughts@alien.top · 2 years ago

gguf files have finished uploading for all llama.cpp users.

Mother-Ad-2559@alien.top · 2 years ago

Speedy work, looking forward to testing these!

perlthoughts@alien.top · 2 years ago

thanks!

perlthoughts@alien.top · 2 years ago

mergekit. there is a link in the post.

roselan@alien.top · 2 years ago

I’m just testing it casually rn in lmsys, and really like its flow and tone. It’s pretty pleasant to speak to from the get-go. It’s a good start and I can’t wait to dig a bit harder into it.

MoneroBee@alien.top · 2 years ago

Great model. Question, do you know why it’s outputting these “<0x0A>” tokens?

For example:

Here are some ways to improve your vertical leap:**<0x0A><0x0A>**1. Strength training: Focus on exercises

LocoMod@alien.top · 2 years ago

I’m getting the same output. Those are line breaks. How odd…
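Until the files are fixed, one possible workaround is to clean the text up after generation. `<0x0A>` is the byte-fallback spelling of byte 0x0A, the line feed character, so the sketch below turns every run of `<0xNN>` markers back into the bytes they stand for and decodes them as UTF-8. The function name `decode_byte_tokens` is made up here, and this only patches the output; it does not address whatever in the tokenizer or conversion causes the tokens to appear.

```python
import re

# One or more consecutive byte-fallback markers such as <0x0A><0x0A>.
_BYTE_RUN = re.compile(r"(?:<0x[0-9A-Fa-f]{2}>)+")

def decode_byte_tokens(text):
    """Replace runs of <0xNN> markers with the characters they encode."""
    def _decode(match):
        hex_pairs = re.findall(r"<0x([0-9A-Fa-f]{2})>", match.group(0))
        raw = bytes(int(pair, 16) for pair in hex_pairs)
        # Decode the whole run at once so multi-byte UTF-8 characters survive.
        return raw.decode("utf-8", errors="replace")
    return _BYTE_RUN.sub(_decode, text)

sample = "improve your vertical leap:<0x0A><0x0A>1. Strength training"
print(decode_byte_tokens(sample))
```

Decoding whole runs rather than single markers matters for non-ASCII text, where one character is spread over several byte tokens.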

perlthoughts@alien.top · 2 years ago

im on it, thanks for testing.

Creative_Bottle_3225@alien.top · 2 years ago

I tried this model a little while ago with LM Studio and I noticed that it does not have GPU acceleration. Sin

ex-arman68@alien.top · 2 years ago

I noticed the same: in LM Studio I cannot enable Apple Metal (GPU), and I get the message: “Metal acceleration is not yet supported for this model architecture (‘starcoder’)”. However, according to Activity Monitor, it fully uses the GPU when it runs. And it is very fast!

perlthoughts@alien.top · 2 years ago

I uploaded my lmstudio config file to https://huggingface.co/NurtureAI/Starling-LM-11B-alpha-v1-GGUF to help people with lmstudio setup.

CharlieBarracuda@alien.top · 2 years ago

Thank you friend, very useful! You unlocked the correct use of this amazing llm for a lot of non-tech users like me

perlthoughts@alien.top · 2 years ago

thanks, i really appreciate it.

perlthoughts@alien.top · 2 years ago

also i released more 11b mistral sizes… in case anyone is interested.

https://huggingface.co/NurtureAI/SynthIA-11B-v1.3

https://huggingface.co/NurtureAI/Mistral-11B-Instruct-v0.1

https://huggingface.co/NurtureAI/dolphin-2.2.1-mistral-11b

https://huggingface.co/NurtureAI/zephyr-11b-beta

https://huggingface.co/NurtureAI/openchat_3.5-11B

https://huggingface.co/NurtureAI/neural-chat-11b-v3-1

May the force be with you.

actualopenai@alien.top · 2 years ago

can u add this one :) https://huggingface.co/maywell/Synatra-7B-v0.3-RP

perlthoughts@alien.top · 2 years ago

for sure right after i inhale this pizza rq

actualopenai@alien.top · 2 years ago

what are ur settings for passthrough as im trying to make a 11b https://huggingface.co/maywell/Synatra-7B-v0.3-RP

perlthoughts@alien.top · 2 years ago

https://huggingface.co/maywell/Synatra-7B-v0.3-RP

https://huggingface.co/NurtureAI/Synatra-11B-v0.3-RP

perlthoughts@alien.top · 2 years ago

    slices:
      - sources:
          - model: berkeley-nest/Starling-LM-7B-alpha
            layer_range: [0, 24]
      - sources:
          - model: berkeley-nest/Starling-LM-7B-alpha
            layer_range: [8, 32]
    merge_method: passthrough
    dtype: float16