When training an LLM how do you decide to use a 7b, 30b, 120b, etc model (assuming you can run them all)?

paradigm11235@alien.top · 2 years ago

When training an LLM, how do you decide whether to use a 7B, 30B, 120B, etc. model (assuming you can run them all)?

creaturefeature16@alien.top · 2 years ago

I have a similar question to OP's. What if you wanted to train a model specifically on coding? And even more specifically on, say, just one particular library?

CKtalon@alien.top · 2 years ago

You are probably talking about fine-tuning rather than (pre)training a model. There are models that were trained for coding, like Code Llama and all its variants. You could probably train on the library’s code, but I doubt you would get much out of it. Perhaps the best way is to create some instruction data based on the library (either manually written or synthetic) and fine-tune on that.
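To make the instruction-data idea a bit more concrete, here is a rough sketch of the synthetic route, assuming the library is an importable Python module with docstrings. The helper names (`build_instruction_records`, `write_jsonl`) and the `instruction`/`response` field names are made up for illustration; use whatever record format your fine-tuning tooling expects.

```python
import inspect
import json


def build_instruction_records(module, max_items=50):
    """Turn a module's documented public functions into instruction/response pairs.

    Hypothetical helper: the record layout below is an assumption,
    not a format required by any particular trainer.
    """
    records = []
    for name, obj in sorted(vars(module).items()):
        # Skip private names and anything that is not a plain Python function.
        if name.startswith("_") or not inspect.isfunction(obj):
            continue
        doc = inspect.getdoc(obj)
        if not doc:
            continue  # nothing to learn from an undocumented function
        try:
            signature = str(inspect.signature(obj))
        except (TypeError, ValueError):
            continue  # signature not introspectable
        qualified = f"{module.__name__}.{name}"
        records.append({
            "instruction": f"Explain what `{qualified}` does and show its signature.",
            "response": f"`{qualified}{signature}`\n\n{doc}",
        })
        if len(records) >= max_items:
            break
    return records


def write_jsonl(records, path):
    """Write one JSON object per line, a common layout for fine-tuning data."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

For example, `write_jsonl(build_instruction_records(json), "train.jsonl")` would produce a small file from the standard `json` module, used here only as a stand-in for the library. Docstring-derived pairs like these are only a starting point; you would likely want to review them by hand and add real usage questions.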

paradigm11235@alien.top · 2 years ago

I’m glad I goofed in my question because your response was super helpful, but I now realize I was missing the terminology when I posted. I was talking about fine-tuning an existing model with a specific goal in mind (re: poetry).