Microsoft just announced the availability of OpenAI GPT-5.5, for which they are charging 30x. In contrast, they charge 7.5x for Claude Opus 4.6 and 1x for OpenAI GPT-5.4.
Check out the token-based pricing and compare GPT-5.5 with all the other models.
I was thinking: if people were to use an image …:my_tag on the host cluster, and some rogue pod on the child cluster (but on the same underlying physical nodes) somehow overwrote the locally cached :my_tag, you could do something on the parent cluster.
But I don't fully understand what you meant by content-addressed :)
Maybe one has to ensure in the host cluster that the image pull policy is set to Always, or that all image references are pinned by digest (SHA) rather than by tag.
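For what it's worth, both mitigations can be expressed in an ordinary pod spec; this is just a sketch, and the image name and digest below are placeholders, not real artifacts:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app        # hypothetical name
spec:
  containers:
    - name: app
      # Mitigation 1: always re-pull from the registry, so a poisoned
      # node-local tag cache is bypassed.
      imagePullPolicy: Always
      # Mitigation 2: reference by immutable digest instead of a mutable
      # tag (placeholder digest shown).
      image: registry.example.com/app@sha256:0000000000000000000000000000000000000000000000000000000000000000
```

Digest pinning is the stronger guarantee of the two, since a digest names the content itself rather than whatever the tag currently points at.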
SPARC (not a VLIW ISA) also had register windows. But IA-64 had a twist on it: the register window size was dynamic and "allocated" by the callee with an alloc instruction.
The only other ISA I know of that did something similar was the Am29000.
The Am29000 modeled it in an interesting way though:
The register file consisted of 128 global registers, but the instruction encoding allowed an "indirect register index" mode, where the operand register was computed from the contents of gr1 plus an offset. Thus gr1 acted as a "register window stack pointer". I _think_ such a computed register index would then be used to index into a separate register file for locals (and arguments etc.), but I'm not sure.
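The indexing scheme I mean can be sketched in a few lines of Python; this is purely illustrative, not cycle-accurate, and the register-file size and wrap-around behavior are my assumptions:

```python
# gr1 acts as a "register window stack pointer": an operand's physical
# local-register index is resolved relative to gr1 at decode time.
NUM_LOCAL_REGS = 128  # assumed size of the local register file

def resolve_local_reg(gr1: int, offset: int) -> int:
    """Physical index = gr1 + encoded offset, wrapping around the file."""
    return (gr1 + offset) % NUM_LOCAL_REGS

def push_window(gr1: int, frame_size: int) -> int:
    """'Allocating' a callee window is just bumping gr1; nothing is copied."""
    return (gr1 + frame_size) % NUM_LOCAL_REGS
```

The appeal is that a call only moves a pointer instead of spilling registers, with spills happening lazily when the window region overflows.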
Is anybody here familiar with this quite old ISA?
(I'm really interested in the richness of the CPU design space, the history of which is fascinating)
Is there evidence that the frontier models at Anthropic, OpenAI, Google, or wherever are not using comparable optimizations to drive down their costs, and that their markup is just higher because they can charge it?
The auth between your app and the proxy can be scoped more easily.
For example, if the proxy runs on localhost, you can trust the localhost workload.
Or you can use some other kind of workload identity proof (like cloud metadata servers). If you leak such a key, no other VM can use it, because it's scoped to your VM.
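The localhost-trust variant of this can be sketched in a few lines; the key value and header are made-up placeholders, and the point is only that the real credential never leaves the proxy, which trusts callers solely because only same-host processes can reach a loopback-bound listener:

```python
# Hypothetical secret held only by the proxy; app workloads never see it.
UPSTREAM_KEY = "real-key-never-leaves-the-proxy"

def outbound_headers(client_ip: str) -> dict:
    """Build headers for the upstream request, refusing non-local callers."""
    if client_ip != "127.0.0.1":
        # A proxy bound to 127.0.0.1 should never see this, but check anyway.
        raise PermissionError("proxy only trusts same-host workloads")
    return {"Authorization": f"Bearer {UPSTREAM_KEY}"}
```

Leaking the app-to-proxy channel then buys an attacker nothing off-host, since the scoped trust (loopback reachability) can't be replayed from another machine.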
That's not true. Neither AWS's nor GCP's workload identity tokens are bound to the VM. If you leak the credentials, they're valid until they expire. On AWS the expiry is 6 hours (non-configurable). Even if your IAM role has a shorter session duration, the credentials assumed by the VM will always be valid for 6 hours.
That entirely depends on where the proxy runs and the extra conditions you can express. E.g. you could bind the credential to a source IP and have the proxy check that, or use some overlay network (as Tailscale does).
My point was that you don't literally have to run the proxy on localhost in order to scope the request.
In exchange for a service that presumably (a) costs Amazon something to operate (so not a pure $100B profit) and (b) Anthropic would have to pay for anyway to operate its business.
So basically, you could view this as a kind of discount: instead of paying less later, you get some cash now and then pay full price later.
I wonder what percentage of the job space truly depends on the current edge we have over machines.
I think it's reasonable to worry that, well before machines are more reliable than the average human (let alone a highly trained one), they can pose a significant disruption to the job market, which will send shockwaves throughout society.
That is why we need functioning states -- free markets won't save you in such a case. Though I've found this hard to explain, especially to people in the U.S., who put "regulation" on a par with four-letter words :)