This dude removed the source repo while I was writing my critique.

Short version: by structuring a framework and example replies that pre-sorted for confirmation bias, and then feeding all of that to an LLM, the entire "research" was a backfill job for Gemini Flash. All slop. I've been developing a new term for this, as it happens more and more often:

Confirmation slop is the tendency of researchers, especially those working in the field of Artificial Intelligence, to prompt an LLM in a way that produces an outcome explicitly confirming or supporting one's prior beliefs, values, or decisions. This is done by building the bias into the prompt through pre-sorted frameworks and example replies. It has become common practice in prompt engineering, because it increases the chances of an acceptable-looking output even when the LLM doesn't have enough information to come to a good response.
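To make the mechanism concrete, here's a minimal, entirely hypothetical sketch of how such a pre-sorted prompt is built. The framework criteria only ask how well the evidence *supports* the claim, and every example reply confirms its claim, so a fluent completion is almost forced to agree:

```python
# Hypothetical illustration of a "pre-sorted" prompt. The framework
# and example replies all point toward confirmation, so disagreement
# is effectively ruled out of the model's output space.

def build_biased_prompt(claim: str) -> str:
    framework = (
        "Evaluate the claim using these criteria:\n"
        "1. How strongly does the evidence SUPPORT the claim?\n"
        "2. Which expert groups AGREE with the claim?\n"
    )
    # Few-shot examples: every one confirms its claim.
    examples = (
        "Example claim: 'X improves Y.'\n"
        "Example answer: 'The evidence strongly supports this claim...'\n"
    )
    return f"{framework}\n{examples}\nClaim: {claim}\nAnswer:"

prompt = build_biased_prompt("Our method outperforms all baselines.")
# Note what is missing: no criterion ever asks what would refute the claim.
assert "SUPPORT" in prompt and "refute" not in prompt.lower()
```

Nothing here is from the removed repo; it's just the general shape of the trick: the bias lives in the scaffolding around the question, not in the question itself.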

But that's not the worst thing here. Karpathy was encouraging others to use this slop in their LLM research and even provided the entire thing as a markdown file you can "easily feed to your LLM". This is automated well poisoning: the 100% slop "research" is granted legitimacy by Karpathy putting his name on it and a gazillion reposts.

Yesterday I recommended using search with LLMs. However, as soon as my private data/doc lake is finished, I will no longer use search. Instead, I will create a vetting/scoring pipeline (with HitL, human in the loop) and use that. This kind of bullshit must stay out of my decision-making process.
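For the curious, the pipeline idea is roughly this. A minimal sketch, with all names and scoring rules hypothetical (a real version would use richer signals like provenance, citations, and author reputation); anything below the threshold gets routed to a human reviewer instead of straight into the data lake:

```python
from dataclasses import dataclass

# Hypothetical vetting/scoring pipeline: score each document with
# crude heuristics, accept high scorers, and send the rest to a
# human-in-the-loop review queue.

@dataclass
class Doc:
    source: str
    text: str
    score: float = 0.0

def score(doc: Doc, trusted: set) -> float:
    s = 0.5 if doc.source in trusted else 0.0
    # Crude slop signal; a real pipeline would use many such checks.
    if "as an ai language model" in doc.text.lower():
        s -= 0.5
    return s

def triage(docs, trusted, threshold=0.4):
    accepted, review = [], []
    for d in docs:
        d.score = score(d, trusted)
        (accepted if d.score >= threshold else review).append(d)
    return accepted, review  # `review` goes to the human in the loop
```

The point of the design is that nothing enters the decision-making corpus on reach or reposts alone; low-confidence material always crosses a human's desk first.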