From the file: "Answer is always line 1. Reasoning comes after, never before."
LLMs are autoregressive (each token is a completion of what came before), so you'd better have thinking mode on, or the "reasoning" is pure confirmation bias seeded by the answer, which gets locked in via the first output tokens.
Yeah this seems to be a very bad idea. Seems like the author had the right idea, but the wrong way of implementing it.
There are actually a few papers that describe how to get faster results and cheaper sessions by instructing the LLM to compress its thinking ("CCoT", compressed chain of thought, is one I remember). It basically tells the model to think like "a -> b". There's some loss in quality, but not too much.
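A minimal sketch of what that kind of instruction could look like in practice. The prompt wording here is my own illustration of the "a -> b" idea, not the actual prompt from the CCoT paper:

```python
# Illustrative only: a system prompt nudging the model toward terse
# "premise -> conclusion" reasoning, in the spirit of compressed CoT.
# The exact wording is an assumption, not taken from the paper.

COMPRESSED_COT_SYSTEM = (
    "Think step by step, but compress each step into a terse "
    "'premise -> conclusion' arrow form, e.g. 'x even -> x%2==0'. "
    "No full sentences in the reasoning. State the final answer last."
)

def build_messages(question: str) -> list[dict]:
    """Assemble a chat-style message list using the compressed-CoT prompt."""
    return [
        {"role": "system", "content": COMPRESSED_COT_SYSTEM},
        {"role": "user", "content": question},
    ]

msgs = build_messages("Is 91 prime?")
```

You trade some answer quality for fewer reasoning tokens, which is the whole point for long or cost-sensitive sessions.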
For the more important sessions, I like to have it revise the plan with a generic prompt (e.g. "perform a sanity check") just so that it can take another pass on the beginning portion of the plan with the benefit of additional context that it had reasoned out by the end of the first draft.
Is this true? Non-reasoning LLMs are autoregressive. Reasoning LLMs can emit thousands of reasoning tokens before "line 1" where they write the answer.
A "reasoning" LLM is just an LLM that's been instructed or trained to start every response with some text wrapped in <BEGIN_REASONING></END_REASONING> or similar. The UI may show or obscure this part. Then when the model decides to give its "real" response, it has all that reasoning text in its context window, helping it generate a better answer.
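A toy illustration of that mechanism: the UI just splits the raw output into a hidden reasoning span and the visible answer. The tag names below mirror the comment's hypothetical `<BEGIN_REASONING>` markers; real models each use their own delimiters:

```python
import re

def split_reasoning(raw: str) -> tuple[str, str]:
    """Separate the hidden reasoning span from the visible answer.
    Tag names are hypothetical; each model/UI picks its own."""
    m = re.search(r"<BEGIN_REASONING>(.*?)<END_REASONING>", raw, re.DOTALL)
    if not m:
        # No reasoning span: the whole output is the answer.
        return "", raw.strip()
    reasoning = m.group(1).strip()
    answer = raw[m.end():].strip()
    return reasoning, answer

raw = "<BEGIN_REASONING>17*3=51, 51<60</END_REASONING>Yes, it fits."
# Note: the close tag regex above matches "<END_REASONING>", so strip the "/"
raw = "<BEGIN_REASONING>17*3=51, 51<60<END_REASONING>Yes, it fits."
thinking, answer = split_reasoning(raw)
# thinking == "17*3=51, 51<60"; answer == "Yes, it fits."
```

The key point stands either way: by the time the model emits the answer, all that reasoning text is already in its context window.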
Before that, I cited NoLiMa (https://www.reddit.com/r/LocalLLaMA/comments/1io3hn2/nolima_...) constantly to illustrate how difficult tasks involving reasoning or multi-step information gathering degrade much faster than the needle-in-a-haystack benchmarks cited by the major labs. Now Chroma is the first stop. Nice job on the research!
• "Neatly-formatted lists": Neatness could be a sign of a machine, or it could be a sign of a diligent human author.
• "Subtitles only a committee would come up with": That seems to me like a matter of opinion and taste — and we all have different tastes.
• "Emojis preceding every statement": I counted three emoji pull quotes in a multi-page document. I suppose it could be an LLM, but it could also just be a nice style.
• "Em-dashes and 'it isn't X, it's Y'": This is why I posted in the first place, and downvoted you. There is nothing wrong with em-dashes — I love them. I use them a lot. Frankly, I probably overuse them. I've used them since I was a kid: I am going to use them — and over-use them — as long as I live. As for 'Love isn't a feeling you wait to have — it's a series of actions you choose to take,' that just seems like normal English to me.
It's very possible in 2025 that the article was LLM-written, or written by a man and cleaned up by an LLM, or written by a man and proofread by an LLM, or simply written by a man. It does not have the stilted feel of most LLM output to me, but I might just be missing it.
An em-dash isn’t an indicator of an LLM — it’s a sign of someone who discovered typography early.
Rug pulls from foundation labs are one thing, and I agree with the dangers of relying on future breakthroughs, but the open-source state of the art is already pretty amazing. Given the broad availability of open-weight models within six months of SotA (DeepSeek, Qwen, previously Llama) and strong open-source tooling such as Roo and Codex, why would you expect AI-driven engineering to regress to a worse state than what we have today? If every AI company vanished tomorrow, we'd still have powerful automation and years of efficiency gains left from consolidation of tools and standards, all runnable on a single MacBook.
The problem is the knowledge encoded in the models. It's already pretty hit-and-miss; hooking up a search engine (or getting human content into the context some other way, e.g. copy-pasting relevant StackOverflow answers) makes all the difference.
If people stop bothering to ask and answer questions online, where will the information come from?
Logically speaking, if there's going to be a continuous need for shared Q&A (which I presume), there will be mechanisms for that. So I don't really disagree with you. It's just that, a lot of the time, having the model just isn't enough. And even if this sorts itself out eventually, we might be in for some memorable times in between the two good states.
Now, to give Claude the steganogravy skill...