reply on: Why OpenAI’s solution to AI hallucinations would kill ChatGPT tomorrow \ stacker news

pull down to refresh

1 sat \ 15 replies \ @0xbitcoiner 17 Sep 2025 \ parent \ on: Why OpenAI’s solution to AI hallucinations would kill ChatGPT tomorrow AI

I know a few people who are always super confident like that. Some of them I know well, and I get that it’s just their personality, those are the ones we usually try to bet with. But when I don’t know the person that well, I don’t do it, and most of the time I just stay quiet, even if I know they’re wrong.

When it comes to AI, though, that kind of blind confidence is actually dangerous. A lot of people are gonna trust whatever it says without question, and that can go really bad, even deadly, like it’s already happened a few times.

112 sats \ 14 replies \ @optimism 17 Sep 2025

The danger isn't the confidence; it is the trust. The same trust people put in shit they see on TV, read on FB or X, in the newspaper, or what their cousin said. This is from a time back when people's only exposure to what was going on outside their immediate circle was coming from the paper and the evening news on TV.

Somehow, a publisher's implicit integrity remains, except it doesn't exist. And since the last decade or so, this has been actively (and nowadays overtly) weaponized.

Don't believe my word on it though. I'm biased, probably wrong, and just another fool tapping keys on his keyboard. No heroes.

101 sats \ 9 replies \ @SimpleStacker 17 Sep 2025

The danger isn't the confidence; it is the trust

Agreed. I actually want the AI to be confident. One of my gripes with it is that when I ask it for coding help, it sometimes gives me 3 different implementations. I don't want 3 implementations, I want you to be opinionated on what's the best one. If I don't like it, you can trust me to ask you to re-evaluate.

So the problem isn't that AI's are too confident. The problem is that users put too much trust in the initial output.

101 sats \ 5 replies \ @k00b 17 Sep 2025

It's interesting that I prefer choosing between the 3 implementations and prefer wishy washiness. At root, I think I don't like having to (or struggle to) adjust my trust levels. I'd much rather figure something out on my own if I have to.

101 sats \ 3 replies \ @SimpleStacker 17 Sep 2025

That's interesting.

Maybe it's because my trust level in the AI is already low, so I don't expect to actually use any of its implementations (at least word for word). I'm mainly using it to get a sense of "where in the code should I be looking", and "what's the general idea for the solution?" as a quicker alternative than reading and crunching all the code in my own mind.

I'm still gonna crunch enough code to understand what's going on, so the purpose of the AI is more like "find me the best jumping off point"

1 sat \ 2 replies \ @optimism 17 Sep 2025

Do you feel that these suggestions help you understand the codebase better?

101 sats \ 1 reply \ @SimpleStacker 17 Sep 2025

Usually yes. So far they've done a decent job in finding the right parts of the code to be looking at, and their suggested solutions are usually on the right track (but usually not something you can just copy paste)

31 sats \ 0 replies \ @optimism 17 Sep 2025

That's cool. I will actually try this when I go work on some software I have never worked on before - have multiple of these on the non-immediate todo list.

101 sats \ 0 replies \ @optimism 17 Sep 2025

Yeah, I rarely use it for code except when I try something new to see what it can do. But then, I've spent 95% of my time reviewing other people's code the last decade, so for me it's not much use in production. I've tried doing AI-enhanced code review where I feed it the resulting code of a diff, but it didn't really work well for me on c++ code. I'm still a skeptic when it comes to production usage really. Maybe autocomplete, but the one in my rich-ish text editor works fine for me.

31 sats \ 2 replies \ @optimism 17 Sep 2025

Lower temperature, non-reasoning may improve here. Also ***IMPORTANT: BE CONCISE!*** at the bottom of the system prompt may work due to the horrors of chat training. Which to me is still the most ridiculous thing ever.

I still have to test InternVL 3.5 (#1194686) in coding abilities because they claim to beat Claude 3.7 with a 14b model, so I'd like to see what's what with that, when I get a moment of peace.

101 sats \ 1 reply \ @SimpleStacker 17 Sep 2025

It's funny how too much reasoning leads to lower confidence / wishy washy answers.

Very human-like behavior.

1 sat \ 0 replies \ @optimism 17 Sep 2025

"but wait" loops are even worse than "you're absolutely right"

101 sats \ 3 replies \ @0xbitcoiner 17 Sep 2025

Spot on! Regular people don't even know what an LLM is, they just see 'AI' and think it's always the truth. You know that Samsung Z Fold ad? It's got AI, and this woman films a bunch of skincare stuff, asking her phone what she should get. That's a weak ad because the AI could've just pulled its answer outta some random website.

34 sats \ 2 replies \ @optimism 17 Sep 2025

Haven't seen and can't find that ad. But that is exactly what people use AI for, right? I had fun watching the "nano banana prompts" (#1218791) from the other day - it's completely useless because you can now dress brad pitt up, photorealistically. lol. But I'm sure this is the amazing functionality we all need in our lives.

101 sats \ 1 reply \ @0xbitcoiner 17 Sep 2025

this

34 sats \ 0 replies \ @optimism 17 Sep 2025

Thanks! Yeah. This is what I'd want my meta glasses to do after I hack out all the meta stuff. (without the verbosity tho. just augment highlight it in my line of sight)