I find Pascal's wager to be of the same nature as Aquinas' Five Ways to prove God, or accelerationist arguments about the inevitability of a Singularity: believing that your own rational argument can prove a fact about reality merely because it feels internally consistent.
Needless to say, I don’t find them at all convincing. Believing in 'nothing' is much better than accepting unconvincing, unneeded supernatural entities.
The Wager doesn't attempt to prove God, it merely states that you might as well worship, because the cost is small and the potential payoff is huge.
It falls apart because, based on what's actually known, there's no way to rule out that worshipping is the thing that condemns you to hell and not worshipping gets you into heaven, rather than the other way around.
Not to mention that "the cost is small" is in the eye of the beholder. I've known people who spend a significant part of their week on religious activities, and that's a huge opportunity cost.
Granted, if your belief is based on Pascal's Wager, and is only to hedge your bets, presumably you wouldn't spend much time on religion. But that also raises the question of whether that style of "belief" would be good enough for whatever god might exist.
Granted^2, spending half of your life devoted to a religion could be deemed a small cost when weighed against the eternity afterward. But then you have to think about the idea that you'll have wasted half of your one and only life if the afterlife turns out not to be real.
At any rate, Pascal seems to have failed to consider that there are thousands of religions to choose from, and that a hypothetical god(s) might punish you for choosing to believe in the wrong one. And might even prefer that you believe in nothing, rather than the wrong one!
That’s the fun thing about the Wager. If the reward is infinite, then any finite cost is worth it for any nonzero probability of obtaining it.
Pascal did actually consider other religions. He just concluded that they were definitely wrong. In his view, either (his brand of) Christianity was correct, or god doesn’t exist.
Yes, my point is that those three arguments may be compelling, but they assume that reality is correlated with the shape of their thoughts. What they have in common is that they all miss the insight that you need to actually test your assumptions to improve your certainty, and that's not feasible for hypothetical all-powerful entities that can bend reality.
Who gets to say that the demos is fundamentally flawed? Each in-group has its own opinion on what counts as a flaw.
Society evolves through epiphenomena caused by the behaviour of the majority; the fact that some minorities view that evolution as 'flawed' cannot change that evolution, unless they're able to influence the majority to also see it that way.
Now, democracy is essentially a way for everybody to broadcast their views on society's flaws in non-violent ways. The alternative is that some groups broadcast their opinions in violent ways, and we have learned to see that situation as undesirable.
> Society evolves through epiphenomena caused by the behaviour of the majority
I would argue plenty of significant societal changes were caused by the behavior of a relatively small number of people. Even more so when you include instances of masterful use of the butterfly effect.
> I would argue plenty of significant societal changes were caused by the behavior of a relatively small number of people
Specific breaking points in history, yeah, maybe. But that's only possible because they're well-connected people near the center of the network.
Those breaking points are possible because those few people either share a viewpoint held by a large number of their peers, or benefit from knowledge accumulated throughout their civilization. Think how every dictator needs the support of a huge following to gain power (and how easy it is to find another dictator to replace them if they die), or how often breakthrough discoveries are made by multiple people at the same time. There's always a last straw that breaks the camel's back, but the lone wolf hardly ever has a significant impact on society at large; they need a receptive audience to have any impact. Humans are herd animals.
Following the metaphor, the butterfly effect is only possible because a storm was brewing in the first place; the butterfly wings only decide where it will appear. Butterfly wings just don't have that much energy.
History is told from the perspective of kings, but kings can reign only within a society that believes in their divine right to rule.
You think Lenin and Mao didn't have an ideology in their societies backing them? Why did people follow their orders then, mind control?
No, I'm saying that they are not cause and effect but coevolution. Their agitprop could have such a huge impact because of the conditions of workers in Tsarist Russia and the Republic of China respectively. It wouldn't have worked in a different society; so no, they didn't single-handedly create the conditions for their own power, there was a previous substrate they could work on.
We're finally getting there. The model of web notebooks looks a lot like HyperCard stacks in terms of usability; all that's missing is someone packaging them in an easy-to-use distribution and sharing environment that doesn't depend on users installing their own web server.
And if that package includes a reasonable local LLM, creating simple programs could be even easier for end users than it ever was with HyperCard.
I didn't mean "like HyperCard" quite so literally. What I meant was a computing environment that seems to blend seamlessly into the wider operating system, and that is able to sufficiently blur the line between end users and "programmers" (there called "authors"). Critical to this capability was the ability to "pop the hood" easily and mess with what was going on underneath.
All of today's computing is fundamentally based on a strong division between programmers and users. That division has only grown more stark with time. The dominance of Unix is partly to blame, in my view.
The end-user developer experience sees the 17e and Neo paired with spoken-instruction AI prompts going to the iPhone, which gets the HyperCard-like network-aware environment to do the thing on the laptop.
Isn't that the same as compressing the whole book, in a special differential format that compares how the text looks from any given point before and after?
There are many ways to describe how the model works in simpler terms. Next-word prediction is a useful characterization of how you do inference with the model. Maximizing mutual information, compression, gradient descent, ... are all useful characterizations of the training process.
But as stated above, next-token prediction is a misleading frame for the training process. While sampling does happen one token at a time, thanks to the training process much more is going on in the latent space, where the model keeps its internal stream of information.
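To make the inference-vs-training distinction concrete, here's a toy sketch (nothing like a real LLM; a simple bigram counter stands in for the trained model). The "training" step produces a compressed summary of the corpus, while inference is the token-at-a-time sampling loop described above:

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """'Training': count bigram frequencies, a compressed summary of the corpus."""
    counts = defaultdict(Counter)
    words = text.split()
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1
    return counts

def sample_next(counts, word, rng):
    """Inference: predict the next token from the learned distribution."""
    options = counts.get(word)
    if not options:
        return None  # no observed continuation
    tokens, weights = zip(*options.items())
    return rng.choices(tokens, weights=weights)[0]

model = train_bigram("the cat sat on the mat and the cat slept")
rng = random.Random(0)
out = ["the"]
for _ in range(5):
    nxt = sample_next(model, out[-1], rng)
    if nxt is None:
        break
    out.append(nxt)
print(" ".join(out))
```

The sampling loop is genuinely one token at a time, but everything interesting happened when the model was built; in a real LLM the counting table is replaced by a latent representation learned over the whole corpus.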
I can only speak for myself, but for me it's all about the syntax. I am terrible at recalling the exact names of all the functions in a library or the parameters in an API, which really slows me down when writing code. I've also explored all kinds of programming languages in different paradigms, which makes it hard to recall the exact syntax of operators (is comparison '=' or '==' in this language? Are comments // or /*? How many parameters does this function take, and in what order?) or control structures. But I'm good at high-level programming concepts, so it's easy to say what I want in technical language and let the LLM find the exact syntax and command names for me.
I guess if you specialise in maintaining a code base with a single language and a fixed set of libraries then it becomes easier to remember all the details, but for me it will always be less effort to just search the names for whatever tools I want to include in a program at any point.
I agree with a bunch of this (I'm almost exclusively doing python and bash; bash is the one I can never remember more than the basics of). I will give the caveat that I historically haven't made use of fancy IDEs with easy lookup of function names, so would semi-often be fixing "ugh I got the function name wrong" mistakes.
Similar to how you outlined multi-language vs specialist, I wonder if "full stack" vs "niche" work tacitly underlies some of the camps of "I just trust the AI" vs "it's not saving me any time".
It is possible to try it, and some people do (high speed trading is just that, plus taking advantage of privileged information that speed provides to react before anyone else).
However, there are two fundamental problems with computational predictions. The first one, obviously, is accuracy. A model is a compressed memorization of everything observed so far; a prediction made with it just projects the observed patterns into the future. In a chaotic system, that only goes so far; the most regular, predictable patterns are obvious to everybody and give less return, while the chaotic system states where prediction would be most valuable are the least reliable. You cannot build a perfect oracle that would fix that.
The second problem is more insidious. Even if you were able to build a perfect oracle, acting on its predictions would become part of the system itself. That would change the outcomes, making the system behave differently from the one the model was trained on, and thus making the model less reliable. If several people do it at the same time, there's no way to retrain the model to take the new behaviour into account.
There's the possibility (but not a guarantee) of reaching a fixed point, where a Nash equilibrium appears and the system settles into a stable cycle, but that's not likely in a changing environment where everybody tries to outdo everyone else.
Ah, this actually connects a few dots for me. It helps explain why models seem to have a natural lifetime: once deployed at scale, they start interacting with and shaping the environment they were trained on. Over time, data distributions, usage patterns, and incentives shift enough that the model no longer functions as the one originally created, even if the weights themselves haven’t changed.
That also makes sense of the common perception that a model feels “decayed” right before a new release. It’s probably not that the model is getting worse, but that expectations and use cases have moved on, people push it into new regimes, and feedback loops expose mismatches between current tasks and what it was originally tuned for.
In that light, releasing a new model isn’t just about incremental improvements in architecture or scale; it’s also a reset against drift, reflexivity, and a changing world. Prediction and performance don’t disappear, but they’re transient, bounded by how long the underlying assumptions remain valid.
So when AI companies "retire" a model, it's not only because of their new, better model, but also because of decay?
PS. I cleaned up the above with AI (not a native English speaker).
For some reason he doesn't like doing mathematical proofs, so he shuns the practice of doing them and invented a new word to describe that way of using formal systems.
Not only did it take > 5 seconds to load a page, images were progressively loaded as fast as two at a time over the next minute or so - if there were no errors during transfer!
But as a metaphor for other creative pursuits, my experience is that most of the time when people are "planning" or working on other things that they like to believe will help them do the thing... they are really just avoiding doing the thing.
People spend years doing "world-building" and writing character backgrounds and never write the damn book. Aspiring musicians spend thousands collecting instruments and never make a song.
As you say, if it's just for fun, that's all fine. But if the satisfaction you want comes from the result of the thing, you have to do the thing.