
In a year where lofty promises collided with inconvenient research, would-be oracles became software tools.

Following two years of immense hype in 2023 and 2024, this year felt more like a settling-in period for the LLM-based token prediction industry. After more than two years of public fretting over AI models as future threats to human civilization or the seedlings of future gods, it’s starting to look like hype is giving way to pragmatism: Today’s AI can be very useful, but it’s also clearly imperfect and prone to mistakes.

That view isn’t universal, of course. There’s a lot of money (and rhetoric) betting on a stratospheric, world-rocking trajectory for AI. But the “when” keeps getting pushed back, because nearly everyone agrees that more significant technical breakthroughs are required. The original, lofty claims that we’re on the verge of artificial general intelligence (AGI) or superintelligence (ASI) have not disappeared, but there’s a growing awareness that such proclamations are perhaps best viewed as venture capital marketing. And every commercial foundation model builder out there has to grapple with the reality that making money now means selling practical AI-powered products that work as reliable tools.

...
25 sats \ 0 replies \ @freetx 3h

I use AI fairly extensively, and I self host some models, but I'm not really an expert.

My take:

  • SOTA models have already delivered nearly all of the functionality gains you are going to see....from here on out it's diminishing returns on their future capabilities.
  • Simultaneously, thanks to newer training and quantization methods, you are going to see smaller models that more or less reach parity with SOTA models.
  • The big growth is going to be in small, hyper-focused, domain-specific models. Your doorbell will run FaceNet, your coffee maker will have a tiny model in it to optimize temp, flow, etc. Almost all of these will not be "LLMs" but vision-style models doing classification....but they will all link into LLM APIs to make use of generative features, so your doorbell can text you a daily update like: "Suzy came in at 3pm with groceries, a UPS delivery guy came at 3:30pm, Suzy left to walk the dog at 4 and came back 15 mins later; while she was gone an unknown person rang the doorbell, here is his pic...." (rough sketch of that pattern below)
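A minimal sketch of that edge-classifier-plus-LLM pattern, assuming a local vision model has already emitted labeled events and that the summary step calls an OpenAI-compatible chat-completions endpoint; the event log, labels, and model name are invented for illustration:

```python
# Hypothetical glue between an on-device classifier and a cloud LLM API.
# The doorbell's local model does the classification; only compact,
# already-structured events leave the device for the generative summary.
import json
import os
import urllib.request

# Invented example output from a local classification model.
events = [
    {"time": "3:00pm", "label": "known_person", "name": "Suzy", "note": "carrying groceries"},
    {"time": "3:30pm", "label": "delivery", "carrier": "UPS"},
    {"time": "4:00pm", "label": "known_person", "name": "Suzy", "note": "left with dog"},
    {"time": "4:10pm", "label": "unknown_person", "note": "rang doorbell, snapshot saved"},
]

def summarize(events):
    """Send the day's events to an OpenAI-compatible chat endpoint for a plain-language recap."""
    payload = {
        "model": "gpt-4o-mini",  # any chat-capable model; name is illustrative
        "messages": [
            {"role": "system", "content": "Summarize these doorbell events as one friendly text message."},
            {"role": "user", "content": json.dumps(events)},
        ],
    }
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(summarize(events))
```

The point is the division of labor: classification stays on the device, and only a short structured event log is shipped off for the generative step.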