Hello
I’m using Axolotl to fine-tune meta-llama/Llama-2-13b-chat-hf.
How should I choose the values for warmup_steps and val_set_size in the Axolotl config YAML file? The example config files use 10 warmup steps and a val_set_size of 0.05, but others have used 100 warmup steps and 0.01 or 0.02 for val_set_size. I have a dataset with around 3800 samples.
If your epoch is only 50 steps, then you are not going to use 100 warmup steps.
In the Training Pro extension I use 0.1 of the total steps for warmup, capped at 100 (there is no point in going higher; after 100 steps you should have primed most of the weights).
So if you have 3800 samples, which is a ton, 100 warmup steps is as good as any.
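To make the rule of thumb above concrete, here is a minimal sketch in Python. The function name suggest_warmup_steps is hypothetical, and the batch size, gradient accumulation, and epoch count in the example are assumptions chosen only for illustration; plug in the values from your own config.

```python
import math


def suggest_warmup_steps(num_samples, micro_batch_size, grad_accum_steps,
                         num_epochs, ratio=0.1, cap=100):
    """Hypothetical helper: warmup = ratio * total optimizer steps, capped at `cap`."""
    # One optimizer step consumes micro_batch_size * grad_accum_steps samples
    # (single-GPU assumption).
    effective_batch = micro_batch_size * grad_accum_steps
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = steps_per_epoch * num_epochs
    return min(cap, int(total_steps * ratio))


# Assumed settings: 3800 samples, micro batch 2, gradient accumulation 4, 3 epochs.
# That is 475 steps per epoch and 1425 steps in total, so the cap of 100 applies.
print(suggest_warmup_steps(3800, 2, 4, 3))

# A run with 50 steps in total gets only 5 warmup steps.
print(suggest_warmup_steps(400, 2, 4, 1))
```

The result would go into warmup_steps in the config. With multiple GPUs the effective batch is larger, so the total step count (and possibly the warmup) shrinks accordingly.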
val_set_size seems to be the size of the evaluation data, given as a fraction of the dataset. First decide whether you want to use evaluation data at all (for some types of training there is no reason to, as it will not evaluate anything useful). Again, with a big dataset 0.04 is fine. With a small dataset, 0.04 may create just 1 evaluation sample, in which case you are far better off with no evaluation dataset at all.
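A quick way to check what a given fraction means for your data is to compute the approximate number of held-out samples. This is only a sketch: the helper name approx_eval_samples is hypothetical, and the exact rounding used by the actual split is an assumption, so treat the result as an estimate.

```python
def approx_eval_samples(num_samples, val_set_size):
    """Hypothetical helper: estimate how many samples a fractional split holds out."""
    # Assumption: the split size is roughly fraction * dataset size;
    # the real library may round up or down by one sample.
    return round(num_samples * val_set_size)


for n in (3800, 25):
    for frac in (0.01, 0.02, 0.04, 0.05):
        print(f"{n} samples, val_set_size={frac}: ~{approx_eval_samples(n, frac)} eval samples")
```

With around 3800 samples, any of the commonly used fractions leaves a usable evaluation set, while with a few dozen samples the same fractions leave about one sample or none.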