Saofiqlord@alien.top to LocalLLaMA • Is there any way to speed up the MythoMax-L2-13B on a 6GB GPU?
1 year ago
Your issue is using q8. Be real, you only have 6 GB of VRAM, not 24.
Your hardware can’t run q8 at a decent speed.
Use q4_K_S instead; the file is roughly half the size, so you can offload a lot more layers to the GPU. There's some quality degradation, yes, but it's not so bad.
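To put rough numbers on that, here's a back-of-the-envelope sketch of how many layers fit on the card for each quant. Everything in it is an assumption for illustration: the file sizes are approximate figures for a 13B GGUF (about 13.8 GB for q8_0, about 7.4 GB for q4_K_S), the 40-layer count is the usual Llama 2 13B layout, the 1.5 GB reserve for context and overhead is a guess, and `layers_that_fit` is a made-up helper, not part of any library.

```python
def layers_that_fit(model_gb: float, n_layers: int,
                    vram_gb: float, reserve_gb: float) -> int:
    """Estimate how many transformer layers can be offloaded to the GPU.

    Assumes the model file is split evenly across layers, which is only
    approximately true (embeddings and output weights are ignored).
    """
    usable_gb = max(vram_gb - reserve_gb, 0.0)   # VRAM left after KV cache etc.
    per_layer_gb = model_gb / n_layers           # rough size of one layer
    return min(n_layers, int(usable_gb / per_layer_gb))


# Assumed, approximate numbers for a 13B model on a 6 GB card.
N_LAYERS = 40
VRAM_GB = 6.0
RESERVE_GB = 1.5  # guess: context, scratch buffers, desktop usage

q8_layers = layers_that_fit(13.8, N_LAYERS, VRAM_GB, RESERVE_GB)
q4_layers = layers_that_fit(7.4, N_LAYERS, VRAM_GB, RESERVE_GB)

print(f"q8_0:   ~{q8_layers} of {N_LAYERS} layers on GPU")
print(f"q4_K_S: ~{q4_layers} of {N_LAYERS} layers on GPU")
```

Under these assumptions q4_K_S gets a bit over half the model onto the GPU while q8 gets about a third, and whatever stays on the CPU is what drags the speed down. Treat the result as a starting point for your GPU layers setting and tune from there.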
Did you forget to unset the rope settings?
CodeLlama requires different RoPE settings than regular Llama (a base frequency of 1,000,000 instead of 10,000).
Also check your sampler settings.
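If it helps to see why a leftover RoPE setting matters, here's a small sketch of the rotation frequencies RoPE derives from its base value. The formula `base ** (-2i / head_dim)` is the standard RoPE definition; the head size of 128 is an assumption matching common Llama configs, and `rope_inv_freq` is just an illustrative helper, not a function from any loader.

```python
def rope_inv_freq(head_dim: int, base: float) -> list[float]:
    """Per-pair rotation frequencies used by RoPE: base ** (-2i / head_dim)."""
    return [base ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


LLAMA_BASE = 10_000.0         # regular Llama / Llama 2
CODELLAMA_BASE = 1_000_000.0  # Code Llama
HEAD_DIM = 128                # assumed head size

llama_freqs = rope_inv_freq(HEAD_DIM, LLAMA_BASE)
code_freqs = rope_inv_freq(HEAD_DIM, CODELLAMA_BASE)

# The first pair is identical, but the slowest pairs differ by a large factor.
print("slowest frequency, Llama base:    ", llama_freqs[-1])
print("slowest frequency, CodeLlama base:", code_freqs[-1])
```

A model trained with one base sees positions encoded very differently if you load it with the other, which is why output tends to turn to garbage when the setting carries over from a previous model.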