There are tangential issues as well. We are probably already at the point where training LLMs on "non-slop" inputs is becoming difficult; at some point it will become impossible. No one is really clear on what happens to model quality at that point (models become obsessively focused on emoji and em-dash communication?).
Secondly, I think of most of this as a "2nd Amendment" issue: we will all need our own open-source / self-hosted LLMs to help counter the big-gov / big-corp AI onslaught.
Directly reading the internet might become too difficult in, say, 10 years; instead, your personal LLM will read it for you, stripping out the obvious agenda-driven slop and presenting you with a more grounded take.
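
For what it's worth, that last part doesn't have to wait 10 years. Here's a minimal sketch, assuming a self-hosted model served through Ollama's local REST API; the model name, URL, and prompt are all placeholders, not a recommendation:

    import requests

    OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint
    ARTICLE_URL = "https://example.com/article"         # hypothetical page to de-slop

    # Fetch the raw page (a real reader would strip the HTML down to text first).
    page = requests.get(ARTICLE_URL, timeout=30).text

    prompt = (
        "Rewrite the following article as a neutral summary. "
        "Drop promotional language, emotional framing, and obvious "
        "agenda-driven spin. Flag any claims made without evidence.\n\n" + page
    )

    # Ask the local model; stream=False returns a single JSON object.
    resp = requests.post(
        OLLAMA_URL,
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=300,
    )
    print(resp.json()["response"])

The filtering quality obviously depends entirely on the local model, but the plumbing is already trivial.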