People talk about it around here like this is pretty simple (these days at least). But once I hit about 4200-4400 tokens (with my limit pushed to 8k) all I get is gibberish. This is with the LLaMA2-13B-Tiefighter-AWQ model, which seems highly regarded for roleplay/storytelling (my use case).
Oddly enough, I also tried OpenHermes-2.5-Mistral-7B and it was nonsensical from the very start.
I’m using SillyTavern with Oobabooga, sequence length set to 8k in both, and a 3090. I’m pretty new to all of this, and it’s been difficult finding up-to-date information (because things develop so quickly!). The term fine-tuning comes up a lot, and with it comes a whooooole lot of complicated coding talk I know nothing about.
As a layman, is there a way to achieve 8k (or more) context for a roleplay/storytelling model?
If you never set the RoPE base (or alpha) higher, the model will just have its stock context (4k for llama2).
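For anyone curious what that dial actually does, here’s a rough sketch of the idea, not the loader’s exact code. RoPE rotates each pair of query/key channels by an angle derived from a base frequency, and raising that base is what stretches the usable context (head dimension and stock base below are llama2’s):

```python
import numpy as np

# Rough sketch of why context "runs out": RoPE rotates each channel pair i
# by angle position / base**(2*i/dim). The model only ever trained on the
# angle combinations produced by positions 0..4095 (llama2's stock 4k), so
# positions beyond that produce patterns it has never seen -> gibberish.
dim = 128        # llama2 attention head dimension
base = 10000.0   # stock RoPE base

inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)
stock_angles = 4095 * inv_freq   # edge of what the model knows
novel_angles = 4400 * inv_freq   # roughly where the gibberish starts

# Raising the base (what the "alpha" setting does under the hood) shrinks
# inv_freq, pulling those novel positions back inside the trained range.
```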
Does anyone have any hints on using exllamav2 with extended context length on GPTQ weights?
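Not a definitive recipe, but exllamav2 reads GPTQ safetensors directly, and loading with a raised alpha looks roughly like this through its Python API (the model path is a placeholder, and attribute names may shift between exllamav2 versions):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer

config = ExLlamaV2Config()
config.model_dir = "models/LLaMA2-13B-Tiefighter-GPTQ"  # placeholder path
config.prepare()

config.max_seq_len = 8192        # raise the context window
config.scale_alpha_value = 2.65  # NTK RoPE alpha for ~2x llama2 context

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # cache sized to max_seq_len
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
```

In Oobabooga the same two knobs are exposed on the ExLlamav2 loader as max_seq_len and alpha_value, so you shouldn’t need to touch Python at all if you’re loading there.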
I’m wondering too. OpenHermes 2.5 works fine for me on Oobabooga, but it just stops outputting tokens once it reaches 4k context despite having everything set for 8k (I’m running GGUF offloaded to the GPU).
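In case it helps debugging: with the llama.cpp loader, both the context allocation and the RoPE base are fixed at load time, so it’s worth checking what the loader actually received. A minimal llama-cpp-python sketch (the path and values are illustrative, not a confirmed fix; note that Mistral-based models like OpenHermes are natively 8k and shouldn’t need a raised base at all):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama2-13b-tiefighter.Q4_K_M.gguf",  # placeholder path
    n_ctx=8192,            # context the loader actually allocates
    n_gpu_layers=-1,       # offload all layers to the 3090
    rope_freq_base=26900,  # raised base for a llama2 model at 8k; omit this
                           # for Mistral-based models (native 8k context)
)

out = llm("Once upon a time,", max_tokens=200)
print(out["choices"][0]["text"])
```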
For llama2 models, set your alpha to 2.65 when loading them at 8k.
The general suggestion is “2.5”, but if you plot the NTK scaling formula on a graph, 8192 context lines up with alpha ≈ 2.642, so 2.65 is more accurate than 2.5.
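For what it’s worth, under the NTK-aware convention that exllama/Oobabooga use, alpha is just a multiplier on the RoPE base, so you can translate it for loaders that want a raw base instead (e.g. llama.cpp’s rope_freq_base). Assuming the base' = base * alpha^(dim/(dim-2)) form, with llama2’s head dimension of 128:

```python
DIM = 128        # llama2 attention head dimension
BASE = 10000.0   # stock llama2 RoPE base

def alpha_to_rope_base(alpha: float) -> float:
    """NTK-aware alpha -> raw RoPE base (assumed exllama convention)."""
    return BASE * alpha ** (DIM / (DIM - 2))

print(round(alpha_to_rope_base(2.5)))   # ~25366
print(round(alpha_to_rope_base(2.65)))  # ~26912
```

So if your GGUF loader only exposes rope_freq_base, something around 26900 corresponds to the 2.65 suggested above.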