https://huggingface.co/NurtureAI/Starling-LM-11B-alpha-v1

This is Berkeley’s Starling-LM-7B-alpha, expanded from 7B to 11B parameters with a layer-duplication (passthrough) merge.
Special thanks to Undi95 for their explanation of the Mistral passthrough merge using cg123’s mergekit, to Berkeley for Starling-LM-7B-alpha, and to everyone contributing to open-source AI development.

Together we are strong!

The performance of this model should improve substantially once it is further fine-tuned so that the newly added layers are trained.

AWQ and GGUF versions coming soon!
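
As a quick usage sketch (not part of the original card): the merged model should load through the standard Hugging Face transformers API. The dtype and generation settings below are illustrative assumptions rather than recommended values.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id from the link above; float16 matches the merge dtype shown in the config further down.
model_id = "NurtureAI/Starling-LM-11B-alpha-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

# Plain prompt for illustration only; the chat/prompt template is not covered here (see the base Starling card).
prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))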

  • perlthoughts@alien.top (OP) · 1 year ago

    I noticed a lot of responses asking about the mergekit configuration I used to copy layers of the 7B Mistral model into an 11B model. Here is my config.yml for mergekit (link in post description):

    slices:
      - sources:
        - model: maywell/Synatra-7B-v0.3-RP
          layer_range: [0, 24]   # first slice: layers 0-23 of the source model
      - sources:
        - model: maywell/Synatra-7B-v0.3-RP
          layer_range: [8, 32]   # second slice: layers 8-31, repeating layers 8-23
    merge_method: passthrough    # stack the slices as-is rather than averaging weights
    dtype: float16
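
    For anyone wanting to reproduce it, a minimal sketch of running the merge, assuming mergekit’s standard mergekit-yaml entry point (the output directory name is just a placeholder):

    mergekit-yaml config.yml ./merged-11b-output --cuda

    Since layer_range is half-open, each slice should contribute 24 layers (0-23 and 8-31), giving the merged model 48 layers versus Mistral-7B’s 32, which works out to roughly 11B parameters.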