Edit: Compare the grammar and spelling of properly uncensored models with censored ones when it comes to “gendered” grammar and natural language concepts. There is a huge difference.
That’s not a question of language; that’s the result of promoting diversity through censorship. And the worst part is, none of the big companies give the faintest whiff of a fuck about diversity or minorities — they are just crossing items off their marketing checklist.
Censorship causes massive performance degradation. If you mindlessly force a model to have a bias, no matter the subject, the model will propagate this bias throughout its entire base knowledge.
Good or bad, a bias is a bias. We are talking about computers, which are deterministic and literal; computer code only ever takes things literally, and LLMs are barely the first step towards generalisation and unassisted extrapolation.
Even when the general concept of, say, fighting gender discrimination is good at its core, force-feeding it to an LLM — which is computer code, after all — will do things like making it completely lose the concept of gender, including linguistic concepts such as grammatical gender, solely because they share the word “gender” throughout the literature and thus the training dataset.
Yes, discrimination is stupid, racism is stupid, forcing everybody to live exactly the one “right” way is stupid.
But using censorship to fight this is hands down the dumbest and laziest way possible.