Not in my opinion, works fine in general, wrote 2500 lines of tests for me over about 30 min tonight.
To your point that this isn't trivial or universal, there's a sharp gradient that you wouldn't notice if you're just opining on it as opposed to coding against it -- ex. I've spent every waking minute since mid-December on MCP-like territory, and it still bugs me out how worse every model is than Claude at it. It sounds like you have similar experience, though, perhaps not as satisfied with Claude as I am.
To your point that this isn't trivial or universal, there's a sharp gradient that you wouldn't notice if you're just opining on it as opposed to coding against it -- ex. I've spent every waking minute since mid-December on MCP-like territory, and it still bugs me out how worse every model is than Claude at it. It sounds like you have similar experience, though, perhaps not as satisfied with Claude as I am.