
>We know that they do not reason because we know the algorithm behind the curtain.

In other words: we didn't put a "reasoning algorithm" into LLMs, therefore they do not reason. But what is this reasoning algorithm that is a necessary condition for reasoning, and how do you know an LLM's parameters didn't converge on it during pre-training?




Model parameters are weights, not algorithms. The LLM algorithm is (relatively) fixed: generate the next token according to the existing context, the model weights, and some randomization. That's it. There is no more algorithm than that. Training can shift the probabilities assigned to a token given the context, but there's no more to it than that. There is no "reasoning algorithm" in the weights to converge to.
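The fixed loop described above can be sketched as follows. This is a minimal illustration, not any real model's code; the `forward` callable is a hypothetical stand-in for the model's forward pass, which is the only place the weights enter:

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Turn raw scores into a probability distribution (numerically stable).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def generate(forward, context, n_tokens, temperature=1.0):
    # The entire "algorithm": score -> normalize -> sample -> append, repeated.
    for _ in range(n_tokens):
        logits = forward(context)  # weights influence only this step
        probs = softmax(logits, temperature)
        token = random.choices(range(len(probs)), weights=probs)[0]
        context = context + [token]
    return context
```

Everything model-specific lives inside `forward`; the surrounding decoding loop never changes, which is the sense in which the algorithm is "fixed".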

This overly reductive description of LLMs misses the forest for the trees. LLMs are circuit builders: the converged parameters pick out specific paths through the network that define programs. In other words, LLMs are differentiable computers [1]. Analogous to how a CPU is configured by program state to execute arbitrary programs, the parameters of a converged LLM configure the high-level matmul sequences toward a wide range of information dynamics.
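A toy illustration of the point (not an LLM, just the principle that a fixed structure with different weights computes different functions): the "architecture" below is always one matrix-vector product followed by a threshold, yet the parameters alone select which boolean circuit it implements.

```python
def matvec(W, x):
    # Fixed structure: a plain matrix-vector product.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gate(W, bias, x):
    # Same "program" every time: linear map + step nonlinearity.
    return [1 if v + b > 0 else 0 for v, b in zip(matvec(W, x), bias)]

# AND circuit: weights require both inputs to be 1.
AND_W, AND_b = [[1, 1]], [-1.5]
# OR circuit: identical structure, different parameters.
OR_W, OR_b = [[1, 1]], [-0.5]
```

Nothing in the code path distinguishes AND from OR; only the numbers do. That is the (much smaller-scale) analogue of parameters configuring the fixed matmul machinery into different programs.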

Statistics has little relevance to how an LLM operates at inference time. The statistics of the training corpus impose constraints on the converged circuit dynamics, but otherwise have no explicit representation inside the LLM.

[1] https://x.com/karpathy/status/1582807367988654081


> LLMs are circuit builders

I think they are circuit "approximators". In other words, the result of a glorified linear regression.


I called it a “big wad of linear algebra,” above. That’s all it is.



