In no particular order! Don’t forget to use each model’s specific prompt format for the best generations!
AWQ and GGUF quantizations are also available.
https://huggingface.co/NurtureAI/zephyr-7b-beta-16k
https://huggingface.co/NurtureAI/neural-chat-7b-v3-16k
https://huggingface.co/NurtureAI/neural-chat-7b-v3-1-16k
https://huggingface.co/NurtureAI/SynthIA-7B-v2.0-16k
Have fun, LocalLLaMA fam <3! Let us know what you find! <3
But “true” 16K-32K models like MistralLite seem to perform much better at long context than the default Mistral config.
There is nothing “true” about MistralLite’s context length. What Amazon (and YaRN) are doing is essentially removing the sliding window.
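To illustrate the point about removing the sliding window: a minimal sketch (not Mistral's actual implementation, just the masking logic) of how a sliding-window causal mask differs from the full causal mask you get once the window is dropped.

```python
# Sketch: Mistral-style sliding-window attention mask vs. the full
# causal mask you get once the window is removed (as in MistralLite).

def causal_mask(seq_len, window=None):
    """Return a seq_len x seq_len boolean mask: True where query i may attend to key j."""
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            visible = j <= i                            # causal constraint
            if window is not None:
                visible = visible and (i - j < window)  # sliding-window constraint
            row.append(visible)
        mask.append(row)
    return mask

# With a window of 4, token 7 cannot directly attend to token 0;
# remove the window and it can.
windowed = causal_mask(8, window=4)
full = causal_mask(8)
print(windowed[7][0], full[7][0])  # → False True
```

With the window in place, distant tokens only influence each other indirectly through stacked layers; removing it gives every token direct access to the full context, at quadratic attention cost.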