
Looks like you've done some thorough testing. Have you found that prompting reliably reduces premature quitting? And does reducing premature quitting actually improve accuracy?



Because these are probabilistic machines, they solve the same problem at a predictable rate. Even when you swap in different numbers, the success rate stays roughly constant.
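
A minimal sketch of how you could measure that rate (ask_model here is a hypothetical stand-in for whatever API wrapper you use):

    import math

    def estimate_success_rate(ask_model, prompt, expected, trials=50):
        # Repeat the identical prompt; with temperature > 0 each
        # trial is an independent draw, so successes are binomial.
        wins = sum(ask_model(prompt).strip() == expected
                   for _ in range(trials))
        p = wins / trials
        # Normal-approximation 95% confidence interval on the rate.
        half = 1.96 * math.sqrt(p * (1 - p) / trials)
        return p, (max(0.0, p - half), min(1.0, p + half))

If the rate is really stable, that interval barely moves when you reword the prompt or change the operands.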

I only noticed the premature quitting issue recently and haven't tested it much yet. It's getting expensive to run Sonnet on hard multiplication problems. I let it run to 200k tokens and it was still grinding away without quitting.

But Opus has a different problem. Ask it to solve a Rubik's Cube and it will run for hours and never solve it. So there are definitely prompts that make it run forever. But if you tell it to break multiplication down into algorithmic steps, it behaves differently. It can take really complicated calculus problems and break them into simpler ones. I can't stump it that way.
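
For concreteness, this is the kind of algorithmic breakdown I mean for multiplication: schoolbook partial products, sketched in Python. Every subproblem is a single-digit multiplication plus a shift.

    def partial_products(a, b):
        # Schoolbook decomposition: each digit of b times a,
        # shifted by its place value.
        steps = []
        for i, d in enumerate(reversed(str(b))):
            steps.append(int(d) * a * 10**i)
        return steps

    # The partial products sum back to the full product.
    assert sum(partial_products(734, 56)) == 734 * 56

Prompting the model to produce steps like these turns one hard problem into a chain of easy ones it can check individually.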

Here's the interesting thing. Even when Opus solves modular expressions by breaking them down the same way it does calculus, it still fails at a predictable rate. There's a constant failure rate no matter what you do, at any level of complexity.

Models have a baseline failure rate that prompting can't change. You can change how they fail -- token burn or quitting early -- but the underlying limit stays the same.
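
If you want to check that for yourself, bucket the results by complexity and see whether the failure rate actually stays flat. A rough sketch (the results format is just an assumption):

    from collections import defaultdict

    def failure_by_complexity(results):
        # results: iterable of (complexity_level, solved_bool) pairs
        # from whatever harness produced them.
        tally = defaultdict(lambda: [0, 0])   # level -> [fails, total]
        for level, solved in results:
            tally[level][0] += 0 if solved else 1
            tally[level][1] += 1
        return {lvl: fails / total
                for lvl, (fails, total) in sorted(tally.items())}

A flat baseline means the per-level rates cluster around one value instead of climbing with complexity.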



