Suppose I instruction-tune an LLM on a dataset where each sample is randomly generated and fitted into a set of prompt templates, so the dataset is, in theory, effectively unbounded, and I train for some fixed number of steps. Is that worse than training on a dataset of fixed size? I'd assume it is worse, because the LLM most likely won't see any instruction example more than once, so it probably can't learn patterns from the data very well. I've trained a couple of models this way for thousands of steps, and they don't seem to have learned anything that transfers to complicated test examples.
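For concreteness, the setup is roughly like the sketch below (the templates and slot fillers here are just placeholders, not my actual data): samples are generated on the fly, so no example ever repeats.

```python
import random
from torch.utils.data import IterableDataset

# Hypothetical prompt templates; in my case there are a handful of these.
TEMPLATES = [
    "Convert {value} {src_unit} to {dst_unit}.",
    "What is {value} {src_unit} expressed in {dst_unit}?",
]

def sample_slots():
    # Hypothetical slot filler; stands in for whatever random generator
    # produces the instruction parameters.
    return {
        "value": random.randint(1, 1000),
        "src_unit": random.choice(["meters", "miles"]),
        "dst_unit": random.choice(["feet", "kilometers"]),
    }

class RandomTemplateDataset(IterableDataset):
    """Yields an endless, effectively never-repeating stream of templated instructions."""
    def __iter__(self):
        while True:
            template = random.choice(TEMPLATES)
            yield template.format(**sample_slots())
```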
It depends on how random the data actually is. Making it truly random would be very difficult; in practice, a "random" templated dataset still carries a repeating pattern even though certain parameters change from sample to sample. Your model will learn that pattern and become repetitive, rather than learning anything that generalizes to harder test examples.
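As a rough illustration (with made-up templates, not yours): even though the slot values are new every time, most of each sample's tokens come from the fixed template, so the surface overlap between "different" samples stays substantial, and that shared skeleton is what the model ends up fitting.

```python
import random

# Hypothetical template and slot values, for illustration only.
TEMPLATE = "Convert {value} {src} to {dst}."

def make_sample():
    return TEMPLATE.format(
        value=random.randint(1, 1000),
        src=random.choice(["meters", "miles"]),
        dst=random.choice(["feet", "kilometers"]),
    )

def jaccard(a: str, b: str) -> float:
    # Fraction of tokens two samples share.
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb)

samples = [make_sample() for _ in range(20)]
overlaps = [jaccard(samples[i], samples[j])
            for i in range(len(samples)) for j in range(i + 1, len(samples))]
# Stays well above what you'd see for genuinely independent text,
# because the template dominates each sample.
print(f"mean pairwise token overlap: {sum(overlaps) / len(overlaps):.2f}")
```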