ShitGobbler69@alien.top to LocalLLaMA • Running full Falcon-180B under budget constraint • 1 year ago
FYI, if all you're using it for is benchmarking (not chat mode), you can probably do it in far less VRAM: load one layer into VRAM, process the entire set of input tokens through it, keep that output, load the next layer into VRAM, and repeat.
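A minimal sketch of that layer-streaming idea, using NumPy matrices as stand-ins for transformer blocks; `to_vram` here is a hypothetical host-to-device copy (in a real setup it would be a CUDA transfer, e.g. `tensor.to("cuda")` in PyTorch), and the layer shapes/count are illustrative, not Falcon-180B's:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 64

# Stand-ins for transformer block weights, all resident in host RAM.
cpu_layers = [rng.standard_normal((hidden, hidden)) / np.sqrt(hidden)
              for _ in range(4)]

def to_vram(w):
    # Hypothetical host->device copy; modeled here as a plain array copy.
    return w.copy()

# The entire set of input tokens, processed all at once per layer.
x = rng.standard_normal((8, 128, hidden))  # (batch, seq_len, hidden)

acts = x
for w in cpu_layers:
    w_dev = to_vram(w)            # load ONE layer's weights into "VRAM"
    acts = np.tanh(acts @ w_dev)  # run every token through this layer
    del w_dev                     # evict it before loading the next layer

print(acts.shape)  # final activations for all tokens: (8, 128, 64)
```

Peak "device" memory is one layer's weights plus the activations, instead of the whole model, at the cost of one full host-to-device transfer per layer per forward pass; that trade-off is why this works for one-shot benchmarking but is painfully slow for interactive chat.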