I would love to be able to run frontier models locally, but I think the larger importance of open-weight models is price accountability.
In the US with our broken system of capitalism, it’s the only way we can tether these companies to reality. Left to their own devices, I’m not convinced they would actually compete with each other on price.
But nobody likes to talk about how “moat” building is fundamentally anti-competitive, even in name.
Funny that self-proclaimed capitalists hate the system in practice. Commodity pricing is what truly terrifies them.
I'm not necessarily interested in having frontier models locally. You don't need to be frontier to be a very good and useful coding agent. I agree with your point on price accountability, though. Hopefully no tariff comes down on the Chinese and European open-weight models.
This is my own take, directly related to this that I posted a little while back. The one thing that I think the article missed is the geopolitical angle they’re also working:
* We need to completely deregulate these US companies so China doesn't win and take us over
* We need to heavily regulate anybody who is not following the rules that make us the de-facto winner
* This is so powerful it will take all the jobs (and therefore if you lead a company that isn't using AI, you will soon be obsolete)
* If you don't use AI, you will not be able to function in a future job
* We need to line up an excuse to call our friends in government and turn off the open source spigot when the time is right
They have chosen fear as a motivator, and it is clearly working very well. It's easier to use fear now, while it's new, and then flip the narrative once people are more familiar with it, than to go the other direction. Companies are not just telling a story to hype their product, but a story about why they alone are the ones who should be entrusted to build it.
"The race to build smarter-than-human AI is a race with no winners."
And specifically about the point on China, several people in power in China have also expressed the need to regulate AI and put international structures of governance in place to make sure it will benefit mankind:
The outcome of this is, in my opinion, the United States Government classifying and regulating LLMs as something akin to how the ATF classifies weapons, i.e., requiring a license to operate an LLM (hosting), with different classifications and determinations on the relative "power" of a particular model and framework, and outright banning most open-source models, much as DIY machine guns or suppressors are banned.
Think of a standard for classifying and regulating the self-hosting of open-source models similar to how an FFL works. You can do it, but you must have all your paperwork lined up, with background checks, a valid business license, and if you forget to dot an "i" or cross a "t" the Cyber version of the ATF shows up and shoots your fucking dog.
> We need to heavily regulate anybody who is not following the rules that make us the de-facto winner
How about building a multipolar world where different parts of the world (US/China/India/EU/Africa,..) get to build sovereign tech and have their own winners?
> In a provocative GitHub post, machine-learning engineer Han-Chung Lee argued that even rosy internal numbers that do show AI-assisted productivity gains are suspect, as they’re produced to hit adoption targets no one can effectively audit.
Isn't this fundamentally what MBAs do with their time? Keep going with this analysis, because it goes much deeper... In my experience, BI is often a house of cards. A lot of times it's just narrative crafting, just like we're all encouraged to do when we write our resumes.
Can you embellish a story? Can you invent a convincing political narrative? As far as I can tell, that's the fundamental unit of the US corporation.
> Traditional call noise canceling relies on those small onboard neural networks and can have difficulty isolating your voice in very noisy environments, which results in ambient noise leaking through or voices getting highly compressed, making it difficult to hear. Anker says the larger neural network available on the Thus chip, plus eight MEMS (micro-electromechanical systems) microphones and two bone conduction sensors to focus in on your voice, in its yet-to-be-announced earbuds will have significantly cleaner call audio, regardless of the environment.
Anyone who likes good noise cancellation, which is a lot of people.
Back in the day we just called it ML. But now you have to stop for a minute to read and determine what they’re talking about, because “AI” is primarily a marketing term.
I tried it on OpenRouter and set max tokens to 8192, and every response is truncated, even in non-thinking mode. Maybe there's an issue with the deployment, but the link you shared also shows that it generates tons of output tokens.
My tinfoil hat theory, which may not be that crazy, is that providers are sandbagging their models in the days leading up to a new release, so that the next model "feels" like a bigger improvement than it is.
An important aspect of AI is that it needs to be seen as moving forward all the time. Plateaus are the death of the hype cycle, and would tether people's expectations closer to reality.
My purely unfounded, gut reaction to Opus 4.7 being released today was "Oh, that explains the recent 4.6 performance - they were spinning up inference on 4.7."
Of course, I have no information on how they manage the deployment of their models across their infra.
I was there too, but honestly, after today, 4.7 "feels" just as bad. I was cynical, but also kind of eager for the improvement. It's just not there. Compared to early Feb, I have to babysit EVERYTHING.
> Cuyahoga Valley: There is nothing wrong with Cuyahoga Valley. Statistically, you’re from Ohio, so why not?
In college, I took an interim elective course on geology of the national parks. On the first day of class, the professor asked an icebreaker for students to say which national park they lived closest to. I said Ohio - Cuyahoga Valley.
Well, some snot-nosed Boy Scout confidently piped up that there were most certainly no national parks in Ohio, and the professor agreed. This is a deep personal grudge that I still hold to this day.
Dry-nosed Eagle Scout here to relieve you of your grudge. There is of course as you know a national park in Ohio and it is wonderful. Grew up right along its edge, and I'm forever grateful for it!
This project started from a belief that LLMs should be better at Python-to-Cython code translation than they are. So we started assembling a large set of parallel implementations.
Then I realized that Claude Code was much better at working on the data using tools (MCP) to check and iterate. The scope transformed into a platform for creating an SFT agentic-trace dataset using sandboxed tools for compilation, testing, linting, address sanitizing, and benchmarking.
We still need to bulk up the GRPO dataset with a large number of good unmatched Python examples. But early results using SFT only on gpt-oss 20b are quite good.
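For context, the check-and-iterate loop behind those agent traces looks roughly like this. This is a minimal sketch with stubbed tools; all function names and the retry logic are illustrative, not the project's actual API, and the real version would call the model and sandboxed compilers instead of stubs:

```python
# Hypothetical sketch of an agentic check-and-iterate loop for
# Python -> Cython translation. The "tools" are stubs standing in for
# sandboxed compile/test/benchmark steps; names are illustrative.

def compile_cython(src: str) -> list[str]:
    """Stub: return a list of compiler errors (empty list = success)."""
    return [] if "cdef" in src else ["no typed declarations found"]

def run_tests(src: str) -> bool:
    """Stub: pretend the parallel test suite passed in the sandbox."""
    return True

def translate(py_src: str, feedback: list[str]) -> str:
    """Stub for the LLM translation step; real version prompts the model,
    feeding any tool errors back in on retries."""
    return "cdef int add(int a, int b):\n    return a + b"

def iterate(py_src: str, max_rounds: int = 3) -> tuple[str, list[dict]]:
    """Translate, check with tools, retry on errors; record the trace."""
    trace: list[dict] = []
    feedback: list[str] = []
    for round_no in range(max_rounds):
        cy_src = translate(py_src, feedback)
        errors = compile_cython(cy_src)
        trace.append({"round": round_no, "errors": errors})
        if not errors and run_tests(cy_src):
            return cy_src, trace   # accepted: becomes an SFT example
        feedback = errors          # feed errors back into the next round
    return "", trace               # rejected after max_rounds

cy_src, trace = iterate("def add(a, b):\n    return a + b")
```

The point of the structure is that every tool call and its result lands in `trace`, so accepted runs double as supervised training examples of tool use, not just final translations.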
I am working on a large-scale dataset for producing agent traces for Python <-> Cython conversion with tooling, and it is second only to Gemini Pro 3.1 in acceptance rates (16% vs 26%).
Mid-sized models like gpt-oss, MiniMax, and Qwen3.5 122B are around 6%, and Gemma4 31B is around 7% (but much slower).
I haven’t tried Opus or ChatGPT due to their high costs on OpenRouter for this application.