This is far from fully litigated. And even if the works produced by generative AI trained on copyrighted material turn out not to be subject to copyright claims by the owners of the original works, that doesn’t mean:
- companies can just use illegally obtained copyrighted works to train their AIs
- companies are free to violate their contracts, whether agreements they’re directly a party to or implicit ones, like the crawl instructions in robots.txt
- users of the models these companies produce are free from liability for the data-inclusion decisions behind the models they use
So I’d say that the data question remains a critical one.
Multiple court cases have already decided that using copyrighted material in AI models is fine and not subject to copyright, though.