Brand New Mistral 16k Context Size Models got released last night from NurtureAI!

perlthoughts@alien.top to LocalLLaMA · English · 2 years ago · 12 comments

In no particular order! Don’t forget to use each of their specific prompts for the best generations!

AWQ and GGUF versions are also available.

https://huggingface.co/NurtureAI/zephyr-7b-beta-16k
https://huggingface.co/NurtureAI/neural-chat-7b-v3-16k
https://huggingface.co/NurtureAI/neural-chat-7b-v3-1-16k
https://huggingface.co/NurtureAI/SynthIA-7B-v2.0-16k

Have fun LocalLLaMA fam <3 ! Let us know what you find! <3
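
For anyone who wants to try one of these, here is a minimal sketch of loading one of the listed repos with transformers. It assumes the repo ships a zephyr-style chat template via its tokenizer; check each model card for the specific prompt format, as the post says.

```python
# Sketch: load one of the 16k repos listed above and generate with its chat template.
# Assumes the tokenizer provides a chat template (zephyr-style); adjust per model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NurtureAI/zephyr-7b-beta-16k"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Summarize this conversation so far."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```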

  • perlthoughts@alien.top (OP) · 2 years ago

    I also released a Chupacabra 7B AWQ version to get extra crispy.

  • mll59@alien.top · 2 years ago

    First, thank you for sharing. However, I was a bit puzzled by these finetunes, since many Mistral-based finetunes can already support longer context out of the box by using NTK scaling, see here. Alas, I couldn't find any information in the model cards about what NurtureAI did to extend the context.

    I've tested NurtureAI's synthia-7b-v2-16k-q8_0.gguf with koboldcpp v1.49, using the model's native rope configuration (rope base frequency 1000000), in an existing conversation of 14971 tokens, asking it to generate a stand-up comedy routine about the preceding conversation, and it produced incoherent babbling. Using the original model synthia-7b-v2.0.Q8_0.gguf (rope base frequency 10000) with --ropeconfig 1.0 45000 instead gives me a coherent stand-up routine that makes sense (a rough llama-cpp-python equivalent is sketched after this comment).

    How well NTK scaling works on Mistral-based finetunes depends on the finetune; for some it works better than for others. For example, when I ask the original zephyr-7b-beta.Q8_0.gguf finetune, in an existing conversation of 25872 tokens, to produce a rhyming poem about the preceding conversation, the resulting poem actually mostly rhymes. Other original finetunes, like synthia-7b-v2.0.Q8_0.gguf, still seem coherent at this context size but are no longer able to produce rhyming poems.

    Anyway, based on my experiments, these extended-context models from NurtureAI do not work for me, while simply using NTK scaling on the original Mistral-based finetunes does.
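
The --ropeconfig override described in the comment above can be approximated outside koboldcpp as well. A minimal sketch using llama-cpp-python, with a placeholder GGUF path and the same rope values mentioned above (scale 1.0, base 45000):

```python
# Sketch: load an original Mistral-based GGUF finetune with an overridden RoPE base,
# roughly equivalent to koboldcpp's `--ropeconfig 1.0 45000`.
from llama_cpp import Llama

llm = Llama(
    model_path="synthia-7b-v2.0.Q8_0.gguf",  # placeholder path to the original finetune
    n_ctx=16384,             # request a 16k context window
    rope_freq_scale=1.0,     # no linear position scaling
    rope_freq_base=45000.0,  # raised RoPE base frequency (NTK-style extension)
)

out = llm("Write a short stand-up routine about the conversation so far.", max_tokens=256)
print(out["choices"][0]["text"])
```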

  • permalip@alien.top · 2 years ago

    I'm not sure who told who that Mistral models are only 8k or 4k. The sliding window is not the context size; the position embeddings determine the context size, which is 32k.

    • TeamPupNSudz@alien.top · 2 years ago

      "I'm not sure who told who that Mistral models are only 8k"

      The official Mistral product information:

      "Our very first foundational model: 7B parameters, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context length." (link)

      Does Mistral themselves actually mention 32k anywhere?

      • permalip@alien.top · 2 years ago

        It has 32k; they mention it in their config: "max_position_embeddings": 32768. That is the sequence length (see the sketch below the screenshot).

        https://preview.redd.it/5r2c9592vr0c1.png?width=256&format=png&auto=webp&s=be88f25168e3cec16cbe7f9aad15f678edf97e99
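
A quick way to verify these config fields, assuming the transformers library and access to the mistralai/Mistral-7B-v0.1 repo:

```python
# Sketch: inspect Mistral-7B's config to compare sequence length and sliding window.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
print(cfg.max_position_embeddings)  # 32768 -> maximum sequence length (32k)
print(cfg.sliding_window)           # 4096  -> sliding-window attention span
print(cfg.rope_theta)               # 10000.0 -> RoPE base frequency
```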

    • mcmoose1900@alien.top · 2 years ago

      But “true” 16K-32K models like MistralLite seem to perform much better at long context than the default Mistral config.

      • permalip@alien.top · 2 years ago

        There is nothing "true" about MistralLite's context length. What Amazon or Yarn is doing essentially amounts to removing the sliding window (see the sketch below).

        https://preview.redd.it/rqe1hwc1vr0c1.png?width=256&format=png&auto=webp&s=79f14a98c097d2e8fb5718ffa4d524353b059a10
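
For illustration, a minimal sketch of that kind of change in transformers: loading the base model with the sliding window disabled and a larger RoPE base. The values here are illustrative assumptions, not MistralLite's exact recipe.

```python
# Sketch: disable Mistral's sliding-window attention and raise the RoPE base frequency,
# the kind of config change discussed above (values are illustrative).
from transformers import AutoConfig, AutoModelForCausalLM

cfg = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
cfg.sliding_window = None   # full attention instead of the default 4k sliding window
cfg.rope_theta = 1000000.0  # larger RoPE base frequency for longer contexts

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1", config=cfg)
```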

  • vasileer@alien.top · 2 years ago

    Is this a scam or what? None of the models above are from NurtureAI:

    - zephyr-beta is trained by HuggingFace and is 32K by default

    - neural-chat is from Intel

    - synthia is from migtissera

    Original links:

    https://huggingface.co/HuggingFaceH4/zephyr-7b-beta

    https://huggingface.co/Intel/neural-chat-7b-v3-1

    https://huggingface.co/migtissera/SynthIA-7B-v2.0

    • MugosMM@alien.top · 2 years ago

      NurtureAI extended the context size to 16k.

      • vasileer@alien.top · 2 years ago

        The context was already 32K:

        https://preview.redd.it/5jl7c7a53i0c1.png?width=958&format=png&auto=webp&s=ae51ae2b52717bb5ab14bed76580e7e0a45075ed

        • MINIMAN10001@alien.top · 2 years ago

          So, assuming this release does anything at all, the only thing I can think of is that instead of the "hidden size" being 4k, giving a 4k sliding window into the 32k context, it would be a hidden size of 16k, giving a 16k window into the 32k context.

          However, that's just speculation on my part, because… otherwise the release means nothing… which would be weird.

          • Flag_Red@alien.top · 2 years ago

            That’s not what hidden size does.
