Look at this: apart from Llama 1, all the other “base” models will likely answer “language” after “As an AI” — the telltale ChatGPT boilerplate “As an AI language model…”. That means Meta, Mistral AI and 01-ai (the company behind Yi) likely trained their “base” models on GPT instruct datasets to inflate benchmark scores and make it look like the “base” models had a lot of potential. We got duped hard on that one.
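A minimal sketch of this kind of contamination probe, assuming the Hugging Face `transformers` library is installed. It just checks which tokens a base model ranks highest after the prompt “As an AI”; `gpt2` is used here only as a small stand-in, since the actual models discussed above are large downloads:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def top_next_tokens(model_name: str, prompt: str, k: int = 5):
    """Return the k most likely next tokens after `prompt` with their probabilities."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        # Logits for the token position right after the end of the prompt.
        logits = model(**inputs).logits[0, -1]

    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k)
    return [(tok.decode(idx), p.item()) for idx, p in zip(top.indices, top.values)]

if __name__ == "__main__":
    # For a truly "clean" base model you'd expect a spread of plausible continuations;
    # a model trained on GPT output tends to put "language" near the top.
    for token, prob in top_next_tokens("gpt2", "As an AI"):
        print(f"{token!r}: {prob:.3f}")
```

Swapping in a suspect model name (e.g. a Mistral or Yi base checkpoint) and comparing the rank of “ language” across models is the whole test.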
It’s almost a shame ChatGPT blew up the way it did. “AI” became a buzzword and every company found a way to shove it into their business model. Now the future of NLP is cloudy because the web has become an ouroboros of data: models training on the output of other models. I think dataset selection and cleaning will become a more important area of research. I’d be surprised if “shoving terabytes of raw web-scraped data into a model” continues to be feasible in the future.