> So the currently delayed feature development is now gonna be further delayed, yet almost every week we see new features and changes, just the other day the single issues view was changed, as just one example.
What's the question here, you don't believe growth is currently exponential, or do you think it shouldn't be hard to scale, when 10x YoY is not enough?
As a business user, our costs have gone up while service has gone down dramatically. Meanwhile our marginal cost to GitHub has hardly changed. Where our costs to them have increased, they mostly charge us per CPU minute, so they're obviously not making any kind of loss on our account.
I’m sure they’re experiencing scaling issues across the platform, but it’s unacceptable for that to have a negative impact on us when we're sending them $250/dev/yr for (what is in all honesty) hosting a bunch of static text files.
I understand that, and maybe GitHub became a bad deal because of that.
But if anything, their post and your reply are precisely an endorsement of usage based billing.
The bit that's growing 13x YoY (and which they expect will easily blow past that) is unmetered - commits. The bit that is metered (for some, not all folks) - action minutes, grew only 2x YoY.
GitHub was not built to limit the number of commits, checkouts, forks, issues, PRs, etc - nor do we want them to - but that's what's growing ridiculously as people unleash hordes of busy beaver agents on GitHub, because they're either free or unlimited.
Where there are limits - or usage based billing - people add guardrails and find optimizations.
Because for all the talk, agents don't bring a 10x value increase; otherwise, they'd justify a 10x cost increase.
Besides, other forges are having issues too. Even running your own. We have Anubis everywhere protecting them for a reason.
That sounds bad. Paying users don't want huge and ever-growing numbers of freeloaders reducing the return for each dollar they spend...
That would only lead to further and further degradation of service until the paying customers were absolutely desperate to find a deal that didn't require them to lug around such a heavy ball and chain.
It all made sense at the beginning when Github was free for OSS and OSS was thriving, but now these billions of commits are mostly incredibly low value. I'd bet the average commit now doesn't create 1/10th of the value the average commit did in, say, 2018
Our 20-dev company is unfortunately exactly the wrong size to justify self-hosting. We're not large enough that it can be someone's dedicated role, and we're not small enough that we can be experimental around our vendors for something so critical to our output.
We're actively looking into alternatives outside of GitHub though.
I'm curious how Azure DevOps reliability has been for comparison. My current job is managing stories in DevOps with SCC in GitHub Enterprise. While I like GitHub slightly more, I've been curious about the decision.
We use Azure DevOps at work for a few things. It's been pretty rock solid, since none of the agents recommend it and it's a different architecture.
It's also legacy at this point, since Microsoft is pouring all its resources into GitHub, but most people/companies could probably use Azure DevOps just fine.
Concur on the rock-solid comment. We use Azure DevOps with git repos and lots of pipelines using self-hosted or Microsoft-hosted agents.
There was an issue with Microsoft hosted agents a few months ago, but that didn't last long, and is the only issue in my memory.
These numbers should have been in the blog post, not the graphs that are present.
> What's the question here, you don't believe growth is currently exponential, or do you think it shouldn't be hard to scale
I think you're putting words in my mouth here; I didn't say either of those things. I'm saying that this blog post is a meaningless platitude when the GitHub stability issues predate this, and that all this post says is "we hear you're having issues".
I just think their charts, taken at face value, show substantially the same thing (for PRs, commits, new repos).
Either those charts are a bald-faced lie (the tweet could be as well), or the growth really is what they show; there's no way for that chart to mean anything else.
The only way to fake exponential growth like that would be to use an inverse log scale (which would be a bald-faced lie).
It doesn't even really matter what the y-axis baseline is, unless we really think growth was huge in 2020, then cratered to zero by 2023, and is now back to the previous normal.
As for the rest of the post, I do think it's panic mode platitudes. But I honestly don't know what I'd write instead that's better.
You can already see people complaining loudly in the cases where, instead of "we'll do better", they decided to limit usage.
> I just think their charts, taken at face value, show substantially the same thing (for PRs, commits, new repos).
The problem is that these charts show the massive exponential growth in 2026. But this didn't start in 2026, this has been going on since early last year. My team had more build failures in 2025 due to Actions outages or "degraded performance" than _any other reason_, and that includes PRs that failed linting or tests that developers were working on.
> As for the rest of the post, I do think it's panic mode platitudes. But I honestly don't know what I'd write instead that's better.
IMO, this needed to be written 6 months ago (around the time the memo about them prioritising the migration to Azure was released), and then this post should have been "We're still struggling, this isn't good enough. Here's the amount of growth, here's what we've done to try and fix it, and here's what we're planning over the next 3-6 months", instead of "Our priorities are clear: availability first, then capacity, then new features" and "We are committed to improving availability, increasing resilience, scaling for the future of software development, and communicating more transparently along the way." This isn't transparency (yet).
I've used it to translate SQLite (with a few extensions) and, as far as I know, it's been used (to varying degrees of success) to translate the MARISA trie library (C++), libghostty (Zig), zlib, Perl, and QuickJS.
More on-topic, I use a mix of an unevaluated expression stack and a stack-to-locals approach to translate Wasm.
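For anyone curious what that looks like in practice, here's a hand-written illustration of the stack-to-locals idea (not wasm2go's actual output; names and exact shape are made up): each Wasm operand-stack slot becomes a Go local, so the stack disappears at translation time.

```go
package main

import "fmt"

// add3 shows the stack-to-locals idea on a tiny Wasm function:
//
//	(func $add3 (param i32 i32 i32) (result i32)
//	  local.get 0
//	  local.get 1
//	  i32.add
//	  local.get 2
//	  i32.add)
//
// Each operand-stack slot becomes a Go local (s0, s1), so the Go compiler
// just sees straight-line code over int32 variables.
func add3(l0, l1, l2 int32) int32 {
	var s0, s1 int32
	s0 = l0      // local.get 0
	s1 = l1      // local.get 1
	s0 = s0 + s1 // i32.add
	s1 = l2      // local.get 2
	s0 = s0 + s1 // i32.add
	return s0
}

func main() {
	fmt.Println(add3(1, 2, 3)) // prints 6
}
```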
Interesting. I started working on this same idea a couple of years ago as a way to bypass CGo. Eventually I moved on to something else. Glad someone else is working on this. How does the generated Go performance compare to the original WASM performance?
That's going to depend on what you mean for "original Wasm performance".
What were you using to run Wasm instead of this?
I can compare with wazero, which I was previously using, and say performance stayed mostly in the same ballpark. Things that crossed the Go-to-Wasm boundary very often became much faster, things that stayed mostly in Wasm became slightly slower, as the wazero compiler is pretty good.
wasm2go also does not support SIMD, so if your Wasm module uses/benefits from SIMD, you'll notice.
Go generates large Wasm modules, because it bundles its goroutine scheduler, garbage collector and standard library into the module.
Translating that back to Go will give you a pretty big Go file.
Go is "known" for being fast to compile, but that huge Go file will take (at least?) as long to compile as compiling the Go toolchain does.
wasm2go is best used on moderately sized modules (like SQLite). Last I heard, the person who tried to translate Perl got an 80MB Go file that was taking them 20 minutes to compile.
Yes. Group #6 is optimal because everyone in group #4 appears selfish and uncaring for others. Group #5 has the moral high ground, but actually voting with group #5 is risky. Therefore the best option is to virtue signal you are in group #5 and get the moral benefits, but vote with group #4 and guarantee survival. Group #6 gives you the benefits of both.
This is similar to Newcomb's two-box paradox, where the optimal strategy is deception. The winning play is to preemptively convince everyone you're only going to take the second box, but then actually take both.
Unless you have a single "reader", don't mind the delay, and don't worry about redoing a bunch of notifications after a crash (and so can delay claims significantly), concurrency will kill this.
I wrote a simple queue implementation after reading the Turbopuffer blog post on queues on S3. In my implementation, I wrote complete SQLite files to S3 on every enqueue/dequeue/ack, using the previous ETag for compare-and-set.
The experiment and back-of-the-envelope calculations show that it can only support ~5 jobs/sec. The only major lever for increasing throughput is increasing the size of group commits.
I don't think shipping CDC instead of whole SQLite files would change the calculations, since the number of writes is what mattered in this experiment.
So yes, with a minimum of 3 writes per job, it can only support very low throughput.
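For context, a minimal sketch of the compare-and-set part, assuming an S3-compatible endpoint that supports conditional writes (If-Match on PUT); the URL and ETag are hypothetical and request signing is left out, so against real S3 you'd send the same header through the AWS SDK or a signed request:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// putIfMatch overwrites the queue object only if its ETag still matches the
// one read earlier. A 412 Precondition Failed means another writer committed
// first, so the caller should re-read the object and retry.
func putIfMatch(url, etag string, body []byte) (newETag string, conflict bool, err error) {
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
	if err != nil {
		return "", false, err
	}
	req.Header.Set("If-Match", etag) // compare-and-set against the previous ETag
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusPreconditionFailed {
		return "", true, nil // lost the race
	}
	if resp.StatusCode >= 300 {
		return "", false, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return resp.Header.Get("ETag"), false, nil
}

func main() {
	// Hypothetical local S3-compatible endpoint and previously observed ETag.
	newETag, conflict, err := putIfMatch(
		"http://localhost:9000/queue/state.db",
		`"d41d8cd98f00b204e9800998ecf8427e"`,
		[]byte("new sqlite snapshot bytes"))
	fmt.Println(newETag, conflict, err)
}
```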
Yeah the C API seems like a perfect fit for this use-case:
> [SQLITE_FCNTL_DATA_VERSION] is the only mechanism to detect changes that happen either internally or externally and that are associated with a particular attached database.
Another user itt says the stat(2) approach takes less than 1 μs per call on their hardware.
I wonder how these approaches compare across compatibility & performance metrics.
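For comparison's sake, a rough sketch of the stat(2)-style check being discussed (the path and interval are made up, and it inherits the WAL-truncation caveat raised elsewhere in the thread):

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// pollWAL remembers the -wal file's size and mtime and treats any change as
// "something may have been committed". The first poll fires once to establish
// a baseline. Caveats from this thread apply: the WAL can be truncated, and it
// also grows for transactions that later roll back.
func pollWAL(path string, every time.Duration, changed func()) error {
	var lastSize int64
	var lastMod time.Time
	for {
		fi, err := os.Stat(path)
		if err != nil {
			return err
		}
		if fi.Size() != lastSize || !fi.ModTime().Equal(lastMod) {
			lastSize, lastMod = fi.Size(), fi.ModTime()
			changed()
		}
		time.Sleep(every)
	}
}

func main() {
	// "app.db-wal" is a hypothetical database's WAL file.
	if err := pollWAL("app.db-wal", 50*time.Millisecond, func() {
		fmt.Println("WAL changed; re-check whatever you care about")
	}); err != nil {
		fmt.Println("stat failed:", err)
	}
}
```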
I just tested this out. PRAGMA data_version uses a shared counter that any connection can use while the C API appears to use a per-connection counter that does not see other connections' commits.
Reporting back. This appears to be a bug in my original test, the code of which sadly I did not commit anywhere. I went back to regenerate these tests and proved the opposite: the C API is better than PRAGMA and works across connections. I am going to make that update, as I've now proved across dozens of versions of SQLite that this is not in fact the case.
Reporting back again. It seems I was actually right the first time - the C API's SQLITE_FCNTL_DATA_VERSION doesn't work cross connection. It is cached on each read - but if there aren't any reads (i.e. just polling SQLITE_FCNTL_DATA_VERSION) then it doesn't work.
PRAGMA data_version is pretty fast (1500ns with prepared statement) and doesn't have that issue.
Checking the wal-index is sub-nanosecond when mmapped but has slightly different behavior on Windows.
Here's the link to the thread on this, my scripts are all there.
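In case it helps anyone else, a minimal Go sketch of the PRAGMA data_version polling approach (the driver choice is just an assumption, and this uses a plain query rather than the prepared statement, which would shave off the parse overhead); the important detail is pinning one connection, since the counter only moves when a different connection commits:

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/mattn/go-sqlite3" // driver choice is an assumption; any SQLite driver works
)

func main() {
	db, err := sql.Open("sqlite3", "app.db") // hypothetical database file
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	ctx := context.Background()
	// PRAGMA data_version only changes when *another* connection commits,
	// so pin a single connection out of database/sql's pool and poll on it.
	conn, err := db.Conn(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	var last int64
	if err := conn.QueryRowContext(ctx, "PRAGMA data_version").Scan(&last); err != nil {
		log.Fatal(err)
	}
	for {
		time.Sleep(50 * time.Millisecond)
		var v int64
		if err := conn.QueryRowContext(ctx, "PRAGMA data_version").Scan(&v); err != nil {
			log.Fatal(err)
		}
		if v != last {
			last = v
			fmt.Println("another connection committed; notify listeners")
		}
	}
}
```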
Yep, definitely still in use. Do yall above have an opinion if the pragma is better than the syscall? What are the trade offs there? Another comment thread mentioned this as well and pointed to io uring. I was thinking that dism spam is worse than syscall spam.
I may be wrong, but I think you wrote somewhere that you're looking at the WAL size increasing to know if something was committed. Well, the WAL can be truncated, what then? Or even, however unlikely, it could be truncated, then a transaction comes and appends just enough to it to make it the same size.
If SQLite has an API it guarantees can notify you of changes, that seems better, in the sense that you're passing responsibility along to the experts. It should also work with rollback mode, another advantage. And I don't think it wakes you up if a large transaction rolls back (a transaction can hit the WAL and never commit).
That said, I'm not sure what's lighter on average. For a WAL mode database, I will say that something that has knowledge of the WAL index could potentially be cheaper? That file is mmapped. The syscalls involved are file locks, if any.
Interesting, thank you for the response and explanation. Honker workers/listeners are holding an open connection anyway. I do trust SQLite guarantees more than cross-platform sys behavior. I will explore the C API angle.