@optimism
20 sats \ 1 reply \ @optimism 26m \ on: Does the US have Dutch disease? econ
Archived link: https://archive.is/WQyjX
This matches my own experience almost exactly: static pipelines work much better than orchestrators for specific automation goals.
I think this is because the orchestrator is extremely vulnerable: every little issue gets amplified in the overall flow. Hallucination, poisoned context (or context overflow!), and even token triggers from feedback at the central point get exaggerated throughout the flow. The risk of strange results multiplies with each call, so calls should be kept to a minimum.
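To put a rough number on "multiplies with each call" -- a back-of-the-envelope sketch, assuming each call independently does the right thing with probability p (the values are illustrative, not measurements):

```python
# If each LLM call independently behaves with probability p, a flow that
# chains n calls succeeds end-to-end with roughly p**n. Numbers are made up
# to illustrate the compounding, not measured from any system.
for p in (0.99, 0.95, 0.90):
    for n in (3, 10, 30):
        print(f"p={p:.2f}, n={n:2d}: end-to-end success ~ {p ** n:.2f}")
```

Even at 95% per call, ten chained calls land you around 60% end-to-end, which is why the orchestrator's extra calls hurt so much.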
I'd add (3); also see #1027214 for an instance where this allegedly happened (tbh I don't really believe the narrative/excuse that it happened the way they said it did): a lawyer gave a draft to a bot and the bot hallucinated cases. This apparently happens a lot; see #1034753.
PS: that the most expensive professionals in the universe partially (or fully) outsource their work to a chatbot is truly appalling to me; but maybe these lawyers all charge normal rates under $150/hr instead of the $2500/hr I'm used to paying.
Makes one wonder... wen slaughterbots
Last year they had a 53M negative cashflow from operating activities (i.e. the software business), according to their 10-K filing for 2024.
So do they really make a profit from the software business?
Found it w/ dlp: "This video is available in Australia, Canada, United Kingdom, Indonesia, India, Kenya, Malaysia, Netherlands, New Zealand, Philippines, Sweden, United States, South Africa."
Yes. See the difference between using it as a tool (this) and as a lazy thing where you don't read what it does (the original)?
Correct. The biases are taken out with reinforcement training. This used to be a human check but is now simply another model checking the answers: the bias is now secondhand, and the bias check itself is also subject to hallucination.
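A minimal sketch of what "another model checking the answers" looks like; generate() and judge() here are hypothetical stand-ins, not any vendor's actual pipeline:

```python
# Hypothetical model-checks-model feedback loop. Both stand-in calls can
# hallucinate, so a wrong verdict silently becomes training signal -- the
# secondhand bias described above.

def generate(prompt: str) -> str:
    # Stand-in for the policy model being trained.
    return f"draft answer to: {prompt}"

def judge(prompt: str, answer: str) -> float:
    # Stand-in for the checker model; its score is taken at face value,
    # with no human reviewing whether the verdict itself is correct.
    return 1.0 if prompt.split()[0] in answer else 0.0

def collect_training_signal(prompts: list[str]) -> list[tuple[str, str, float]]:
    return [(p, a, judge(p, a)) for p in prompts for a in (generate(p),)]

print(collect_training_signal(["why is the sky blue?"]))
```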
I know how to automate it with AI, but it'd be one or two weeks of work, because you'd want NLP combined with an inverse chatbot and then work out the word-distance math, not a chatbot spitting out bs. I'm not ready to spend that kind of time on the process yet.
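For context, by "word-distance math" I mean something like comparing embedding vectors rather than asking a chatbot for a verdict; a minimal sketch with made-up vectors (a real version would pull these from an actual embedding model):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Similarity between two embedding vectors; 1.0 means same direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 4-dimensional embeddings purely for illustration; real ones
# would come from an embedding model and have hundreds of dimensions.
post_vec = np.array([0.9, 0.1, 0.3, 0.0])
topic_vec = np.array([0.8, 0.2, 0.4, 0.1])
print(f"similarity: {cosine_similarity(post_vec, topic_vec):.3f}")
```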
Right now, I've yolo'd a script that parses text from the index (where I just c&p the index entries I like), then I look up the post ID and throw it all into a spreadsheet. Because of this manual process I get to read/bookmark every SN post I like, not just the AI ones, because I look at every title of every post of the entire week right now. So I get personal added value from doing it manually; besides that, it keeps me sane to have something I don't automate.
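Roughly the shape of that script -- the regex, sample data, and column layout are a sketch, not the actual code:

```python
import csv
import re

# Pasted index entries, one per line; "#1234567" matches the SN post ID
# format used elsewhere in this thread. Titles here are made up.
pasted = """Some post title I liked #1234567
Another one worth bookmarking #7654321"""

rows = []
for line in pasted.splitlines():
    m = re.search(r"(.+?)\s*#(\d+)\s*$", line)
    if m:
        title, post_id = m.group(1).strip(), m.group(2)
        rows.append([post_id, title, f"https://stacker.news/items/{post_id}"])

# Dump to CSV for the spreadsheet step.
with open("sn_index.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "title", "url"])
    writer.writerows(rows)
```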
What I find intriguing is that the Italians are at the forefront of championing less regulation. That used to be the role of the UK and the smaller, more libertarian countries. Keeping the Franco-German love affair in check may be the most important challenge Europe faces.
The real problem at hand is how much trust people place in the answers they receive when working with an LLM. Maybe the best outcome is that we all get seeded with a very strong distrust of LLM outputs -- at least enough to check the answers once in a while.
I think the outrage is an important counterweight to the exaggerated claims from all the LLM bosses. They just spent billions on something that is both great (from a big-data-aggregation/achievement perspective) and mediocre (from a usability/advertised-use-case perspective) at the same time, and they need to reinforce the success story to get even more billions to improve the latter by any means possible.
Because both traditional news and social media are saturated with the billionaires rather than the boring real research (or even the yolo "hey we found something interesting" kind of research), the world only gets to hear the banter. I'd argue the outrage is actually too little, because which player has been decimated thus far? None. They all get billions more, and thus far they spend it on the next model that is still mediocre, because there are no real breakthroughs (also see #1020821).
If more weight were given to what goes wrong, the money would potentially be spent on real improvement, not more tuning and reiteration with more input. As long as that's not the case and large-scale parasitic chatbot corporations can keep iterating on subprime results, we'll be stuck with hallucinating fake-it-till-you-make-it AI that is not fit for purpose.