Is it possible to fine-tune a 33B model with 48GB VRAM?

tgredditfc@alien.top · 2 years ago

Is it possible to fine-tune a 33B model with 48GB VRAM?

FullOf_Bad_Ideas@alien.top · 2 years ago

8-bit? 4-bit QLoRA? You can train 34B models on 24GB. You might need to set up DeepSpeed if you want to use both cards, or you can just train on a single 24GB card. PSA if you are using axolotl: disabling sample packing is required to enable Flash Attention 2; otherwise flash attention will simply not be enabled. This can save you some memory. I can train a Yi-34B QLoRA with rank 16 and ctx 1100 (and maybe some more) on a 24GB Ampere card.
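
For a rough sense of why 4-bit fits on 24GB while 8-bit does not, here is a back-of-the-envelope sketch of the memory taken by the base model weights alone. The helper name `weight_memory_gb` is made up for illustration, and the estimate deliberately ignores quantization overhead, LoRA adapter weights, optimizer state, and activations, so real usage will be higher and will grow with context length.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights only, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width; quantization
    constants, adapters, optimizer state, and activations are not counted.
    """
    return n_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for name, n_params in [("33B", 33e9), ("34B", 34e9)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(n_params, bits)
            print(f"{name} @ {bits}-bit weights: ~{gb:.1f} GB")
```

By this estimate a 34B model needs about 17 GB for weights at 4-bit, which leaves only a few GB on a 24GB card for everything else. That is consistent with needing a small rank and a short context. At 8-bit the weights alone are about 34 GB, so that would have to be split across both cards.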