@frontenbrecher - Power User

0 Posts
1 Comment

Joined 2 years ago

Cake day: October 31st, 2023


frontenbrecher@alien.top B to LocalLLaMA • llama2 13B on Gtx 1070
2 years ago
Use koboldcpp with a GGUF model so the layers can be split between GPU and CPU, preferably a Q4_K_S quantization for better speed. I expect it will still be slow, possibly 1-2 tokens per second.
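
To get a rough idea of how much of the model fits on the card, here is a back-of-the-envelope sketch in Python. It is not anything koboldcpp itself does. The numbers are assumptions: a Q4_K_S file of about 7.4 GB for the 13B model, 40 transformer layers, 8 GB of VRAM on the GTX 1070, and a guessed 1.5 GB held back for context and scratch buffers. The function name `estimate_gpu_layers` is made up for illustration.

```python
import math


def estimate_gpu_layers(model_file_gb: float, n_layers: int,
                        vram_gb: float, reserve_gb: float = 1.5) -> int:
    """Rough count of layers that fit in VRAM, assuming equal-sized layers."""
    # Assumption: the file size is spread evenly over all layers.
    per_layer_gb = model_file_gb / n_layers
    # Assumption: reserve_gb is kept free for context and scratch buffers.
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # Never report more layers than the model actually has.
    return min(n_layers, math.floor(usable_gb / per_layer_gb))


# Assumed figures for Llama 2 13B at Q4_K_S on a GTX 1070 (8 GB).
layers = estimate_gpu_layers(model_file_gb=7.4, n_layers=40, vram_gb=8.0)
print(f"Try offloading about {layers} layers to the GPU")
```

Treat the result as a starting value for koboldcpp's GPU layers setting and lower it if you run out of memory; the real limit depends on context size and what else is using the card.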
