Nkingsy@alien.top to LocalLLaMA • Is LLaMA-1-65B or LLaMA-2-70B more creative at storytelling?
I think llama 1 had more interesting training data, but it can't hold a plot too well.
Llama 2 was trained on a larger number of tokens. All the Llama models appear to be undertrained, though, especially the 70B.
Or, the more undertrained a model is, the more fat there is to trim.