perlthoughts@alien.top to LocalLLaMA · 2 years ago · 12 comments

Brand New Mistral 16k Context Size Models got released last night from NurtureAI!
In no particular order! Don’t forget to use each model’s specific prompt format for the best generations!

AWQ and GGUF versions are also available.

https://huggingface.co/NurtureAI/zephyr-7b-beta-16k
https://huggingface.co/NurtureAI/neural-chat-7b-v3-16k
https://huggingface.co/NurtureAI/neural-chat-7b-v3-1-16k
https://huggingface.co/NurtureAI/SynthIA-7B-v2.0-16k

Have fun, LocalLLaMA fam <3! Let us know what you find! <3
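For anyone who wants a quick start, here is a minimal loading sketch with Hugging Face transformers (the model id is from the links above; the device/dtype settings and the assumption that the checkpoint ships a chat template are mine, not NurtureAI’s instructions):

```python
# Minimal sketch: load one of the 16k checkpoints listed above with
# transformers. Assumes a recent transformers install, enough VRAM, and
# that the checkpoint inherits zephyr's chat template; nothing here is
# specific to NurtureAI beyond the model id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NurtureAI/zephyr-7b-beta-16k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place layers on available GPUs automatically
    torch_dtype="auto",  # keep the checkpoint's stored dtype
)

# Each model family has its own prompt format; the tokenizer's chat
# template applies the right one for this checkpoint.
messages = [{"role": "user", "content": "Summarize this long document..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```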

  • permalip@alien.top · 2 years ago

    I’m not sure who told who that Mistral models are only 8k or 4k. The sliding window is not the context size; the number of position embeddings is the context size, and that is 32k.
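    To put rough numbers on that distinction (a back-of-the-envelope sketch; the window size, layer count, and “theoretical span” figure come from Mistral 7B’s config.json and the Mistral 7B paper, as an illustration rather than a measurement):

    ```python
    # Back-of-the-envelope: the sliding window caps how far ONE layer
    # attends, but stacked layers relay information further back.
    sliding_window = 4096            # per-layer attention window
    num_hidden_layers = 32           # transformer layers in Mistral 7B
    max_position_embeddings = 32768  # position embeddings -> 32k context

    # Information can propagate roughly window * layers tokens through
    # the stack (the paper's theoretical attention span):
    print(sliding_window * num_hidden_layers)  # 131072

    # The hard cap on sequence length is the position embeddings:
    print(max_position_embeddings)             # 32768
    ```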

    • TeamPupNSudz@alien.top · 2 years ago

      > I’m not sure who told who that Mistral models are only 8k

      The official Mistral product information.

      > Our very first foundational model: 7B parameters, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context length. (link)

      Does Mistral themselves actually mention 32k anywhere?

      • permalip@alien.top · 2 years ago

        It has 32k; they state it in their config: “max_position_embeddings”: 32768. That is the sequence length.

        https://preview.redd.it/5r2c9592vr0c1.png?width=256&format=png&auto=webp&s=be88f25168e3cec16cbe7f9aad15f678edf97e99
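        You can also verify it without the screenshot; a minimal sketch, assuming transformers is installed and the Hub is reachable:

        ```python
        # Minimal sketch: read the fields straight from the Hub config.
        from transformers import AutoConfig

        cfg = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
        print(cfg.max_position_embeddings)  # 32768 -> sequence length / context
        print(cfg.sliding_window)           # 4096  -> per-layer attention window
        ```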

    • mcmoose1900@alien.top · 2 years ago

      But “true” 16K-32K models like MistralLite seem to perform much better at long context than the default Mistral config.

      • permalip@alien.top · 2 years ago

        There is nothing “true” about MistralLite’s context length; what Amazon (with MistralLite) and the YaRN models are doing is essentially removing the sliding window.

        https://preview.redd.it/rqe1hwc1vr0c1.png?width=256&format=png&auto=webp&s=79f14a98c097d2e8fb5718ffa4d524353b059a10
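        In transformers terms, that change roughly looks like the sketch below (an illustration of the config tweak only; it does not reproduce the long-context fine-tuning that MistralLite and the YaRN models also apply):

        ```python
        # Sketch of "removing the sliding window": with sliding_window=None,
        # Mistral attention falls back to full (global) attention instead
        # of the 4096-token window.
        from transformers import AutoConfig, AutoModelForCausalLM

        cfg = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
        cfg.sliding_window = None  # attend over the full sequence

        model = AutoModelForCausalLM.from_pretrained(
            "mistralai/Mistral-7B-v0.1",
            config=cfg,
        )
        ```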
