
The image on the second one doesn't show for me. A bug perhaps?
reply
Yep, they no longer store it in the app history. This was the screen:
reply
Right. So I'd guess it's either a bug or a filter specifically meant to stop people from using GPT to solve Duolingo.
reply
Unlikely it's a filter. There is no benefit whatsoever to cheating on Duolingo. You pay to learn.
It's a new model that prefers to engage in conversation rather than do what it's told. They also reduced the number of image uploads per day on a free account from three to one. They're pushing people to pay for this crap.
reply
gpt5-main (the non-thinking model) has instruction-following regressions that are still unfixed (I guess they don't want to fix them).
Just out of interest, I ran your image with the same instruction through a small Gemma 3 distill,
ggml-org/gemma-3-4b-it-GGUF:Q4_K_M, using the llama.cpp server:
I don't know whether the answer is in any way correct, but all of this is runnable locally with minimal memory (this particular one should fit in about 4 GB).
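For anyone who wants to reproduce this, here's a minimal sketch of querying the llama.cpp server with an image over its OpenAI-compatible API. The model reference comes from above; the port, file name, and prompt are assumptions for illustration:

```python
# Minimal sketch: send a screenshot to a local llama.cpp server.
# Start the server first (recent llama.cpp builds fetch the vision
# projector automatically for supported models):
#   llama-server -hf ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
import base64
import json
import urllib.request

# Hypothetical file name for the screenshot from the thread.
with open("duolingo_screen.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

payload = {
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Solve, don't ask me any questions"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
}

# llama-server listens on port 8080 by default.
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```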
reply
It simply did OCR on the Japanese symbols
reply
It could be cheating on tests in general
reply
Solve, don't ask me any questions
I told you
This "adverserial" style of evaluating an LLM is interesting, but not the best evaluation. The best evaluation for how good an LLM is is if you give your best prompt instead of antagonizing the chatbot. That's the best way how we find out what its maximum capabilities are.
reply
Because it asks tons of questions otherwise. My best prompt has always been to just share the screenshot, and that worked great in the past. Now it just antagonizes me with its stupidity, laziness, and lies.
reply
I've noticed this pattern a lot recently:
Me: Solve problem X.
ChatGPT: I did Y. Would you like me to also do Z? (where Z is obviously the reasonable thing to do in the first place as part of the solution to X)
Me: Don't ask me. Just complete the task; Z is part of the solution.
reply
It seems they're just making paying customers burn through more API calls.
reply
Or they just optimized their model too heavily for "user engagement".
Their models have always included hooks for further conversation; perhaps this objective was weighted too heavily during training.
reply
So true
reply
This is why I have a massive ollama model backup.
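For the curious, here's a minimal sketch of what that looks like with the ollama Python client (pip install ollama; the Ollama daemon must be running). The model names are assumptions for illustration:

```python
# Minimal sketch: pull models once so they're stored locally,
# then query them offline instead of relying on a hosted API.
import ollama

for model in ("gemma3:4b", "llama3.1:8b"):  # hypothetical picks
    ollama.pull(model)

resp = ollama.chat(
    model="gemma3:4b",
    messages=[{"role": "user", "content": "Translate こんにちは to English."}],
)
print(resp["message"]["content"])
```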
reply