
Do you do that in the same session?

So what I find is that if I say "review <xyz> for logic errors and omissions" in a new session, I can be sure that Claude finds a lot of stuff it missed the first run. (I work git-based with all models, so there's some indirection there too that may help trigger a different pattern through the layers.) (Hate the apologies tho, wtf Anthropic.)

However, I do admit that second-model review works better. I think k00b was experiencing the same by mixing GPT and Claude for reviews.

I might do a write up of my process. It was an iterative loop along these lines:

  1. Start a new session (no memory)
  2. Upload current version of model and (incomplete) proof
  3. Ask it to complete the proof
  4. Evaluate the AI's proof completion
  5. Poke the AI on parts that I found incorrect or dissatisfying
  6. Iterate within the chat on why that part was hard or dissatisfying
    • If simply a mistake by the AI, fix it
    • If model setup is genuinely problematic, revise assumptions
  7. Update the model with new assumptions as appropriate
  8. Go to step 1
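
The loop above can be sketched as a driver, just to pin down the control flow. Everything in this sketch is a hypothetical stand-in: `FakeSession` simulates a fresh chat, and `evaluate()` stands in for my own reading of the proof; neither is a real API.

```python
class FakeSession:
    """Stands in for a fresh chat session (step 1: no memory carried over)."""
    def __init__(self):
        self.context = []              # starts empty every round

    def upload(self, *files):          # step 2
        self.context.extend(files)

    def ask(self, prompt):             # steps 3 and 5 (canned reply here)
        return f"answer to: {prompt}"

def evaluate(draft, round_no):
    """Step 4: human review. For the sketch, pretend the first two
    rounds each surface one issue, then the proof reads clean."""
    return [f"gap in round {round_no}"] if round_no < 2 else []

def refine(model_file, proof_file, max_rounds=5):
    for round_no in range(max_rounds):
        session = FakeSession()                        # step 1: new session
        session.upload(model_file, proof_file)         # step 2
        draft = session.ask("complete the proof")      # step 3
        issues = evaluate(draft, round_no)             # step 4
        for issue in issues:                           # steps 5-6: poke and iterate
            session.ask(f"justify or fix: {issue}")
            model_file += " +revised"                  # step 7 (crudely)
        if not issues:
            return draft, round_no                     # converged
    return draft, max_rounds                           # step 8: otherwise loop

result, rounds = refine("model.txt", "proof.txt")
```

The point is the shape, not the stubs: each round throws away the old context entirely, which is what seems to trigger the fresh look.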

For step 5, is there a way to make it come to that conclusion?

Not sure if it's 100% comparable, but maybe there's something close: if I find an issue in code, instead of "arguing" or "pressing" I just say "write a test around <xyz> in case there are any issues", and 90% of the time it then finds its own error.
