The warnings firing off hours later is obviously awful design, but the warnings are just warnings. The spend caps are something different and Gemini has them at the very least.
For most use cases where businesses use the cloud, hard spending caps are an awful idea anyway. Killing your servers the moment you start picking up loads of new customers is a surefire way to kill your big growth opportunity at exactly the wrong time.
Of course, if you're not planning for sudden massive growth, you'd be crazy to host your stuff with the big three cloud providers.
> We had a budget alert (€80) and a cost anomaly alert, both of which triggered with a delay of a few hours
> By the time we reacted, costs were already around €28,000
> The final amount settled at €54,000+ due to delayed cost reporting
So much for the folks defending these three companies that refused to provide hard spending cap ("but you can set the budget", "you are doing it wrong if you worry about billing", "hard cap it's technically impossible" etc.)
Yeah, that's the main reason I never use services like Google Cloud if I don't have to: it's impossible to have a hard cap, and anyone pretending to be an expert is just off.
Google says that they can't provide a hard cap because that would mean shutting down all your services, blah blah, but they could at least give users the option.
We have spend caps at the billing account level and the project level (developer set) in the Gemini API now. There is up to a 10 minute delay in processing everything but this should significantly mitigate the risk here: https://ai.google.dev/gemini-api/docs/billing#tier-spend-cap...
By default, new Tier 1 paid accounts can only spend $250 in a given month.
I just find it extraordinary that the biggest tech company in the world can do cutting-edge real-time AI for millions of people, run YouTube and of course all the other Google services, with literally the smartest people in the world and unlimited resources on board, but still can't keep real-time track of the user's current billing and their spending limits; it's all still best effort. Somehow it doesn't add up. (Pun not intended, but I'm happy to have it)
I'm sure it's me being an idiot, but once again I spent 20m trying to figure how to do a specific thing in google-land and still haven't figured it out. Even if I did set it somewhere, I see things like "Setting a budget does not cap resource or API consumption" with a link to a bunch of documentation I have to analyze.
It shouldn't mean shutting down all your services, it should mean not letting you provision new ones and limiting the scope of what you can continue doing.
If I budget enough to store 1TB of data for 1 month, then on the first day of the month I store 2TB of data - what should the behaviour be after 15 days?
Read/write access should be frozen, data should be saved for 1 month so you have time to react to warning emails. If you didn't upgrade in that time, it should be deleted.
Nuke the data. It’s gone forever if you didn’t back it up elsewhere. This should be a meaningful risk mitigation that I can employ to avoid having a catastrophic financial disaster.
This isn’t a limit I’m setting at some percentage above expected costs, it’s: “I don’t want to take out a HELOC if something goes wrong”
Unfortunately, a lot of people keep their backups in the same cloud account as their primary data, thinking that multiple copies and multiple availability zones are sufficient.
For these users, the article’s €54k bill would be replaced with their business data getting wiped out.
If you have a lambda set up that normally runs a hundred times a day, and suddenly it tries to spin up 10 million instances, it should block that unless you specifically enable it.
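A minimal sketch of that kind of guard, assuming a per-function daily baseline and an explicit opt-in for bursts. `InvocationGuard`, `max_multiplier` and `allow_burst` are made-up names for illustration, not part of any cloud provider's API:

```python
from dataclasses import dataclass


@dataclass
class InvocationGuard:
    """Hypothetical guard: blocks invocations far above the usual daily volume."""
    baseline_per_day: int          # typical daily invocations, e.g. 100
    max_multiplier: float = 10.0   # assumed threshold: 10x the baseline
    allow_burst: bool = False      # the "specifically enable it" switch
    used_today: int = 0

    def try_invoke(self, count: int = 1) -> bool:
        """Return True and record the usage if the request is allowed."""
        limit = self.baseline_per_day * self.max_multiplier
        if not self.allow_burst and self.used_today + count > limit:
            return False  # anomalous spike: refuse instead of billing for it
        self.used_today += count
        return True


guard = InvocationGuard(baseline_per_day=100)
print(guard.try_invoke(100))         # a normal day's traffic
print(guard.try_invoke(10_000_000))  # the runaway case
```

The normal day goes through, the 10 million spike is refused, and setting `allow_burst=True` is the deliberate override for people who really do expect that kind of growth.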
You know that's not how the cloud works. If you're billed by the hour for compute and that compute is powering a server, the only way to stop that is by shutting off the compute, breaking the server.
I would love to have a “if the bill for this hobby project becomes a threat to my ability to pay my mortgage, nuke it.” If I cared about the data enough, I’d have backed it up.
That's actually crazy. So I can build a project I love, that does good, but somehow get in a situation where I'm accidentally paying 30.000€ (or 50.000€) to a big tech company? How is that fair? I mean yes, as a software engineer, you ought to reflect on all possible weaknesses, but there was a time when overlooking something meant something completely different than being down 30/50k. That is actually life-altering.
Your kid can do this in a smartphone game designated suitable for children, heavily optimized to exacerbate the possibility, and depending on where you live they can just choose not to refund you.
When the FTC went investigating a decade-ish ago they found Facebook saying the quiet parts out loud: it was all extremely deliberate.
you cannot earn billions a year and not be cheating your users out of their money. it's that simple. they don't care for people, otherwise they wouldn't be putting so much effort into making them poor.
agree. the real problem isn't that hard caps are "technically impossible" — it's that the incentive to build them is backwards. a hard cap that stops a runaway process costs the cloud provider money. a "budget alert" that fires after the fact costs the customer money.
the 10-minute delay in billing processing is doing a lot of work in that logankilpatrick comment. at $4k/minute burn rates, that's still a $40k exposure window
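For what it's worth, the exposure arithmetic is easy to sketch. The function name below is made up for illustration, and the $4k/minute burn rate and 10-minute delay are just the numbers quoted in this thread, not anything Google publishes:

```python
def exposure_window(burn_per_minute: float, delay_minutes: float) -> float:
    """Worst-case spend that accrues before a delayed cap can take effect."""
    return burn_per_minute * delay_minutes


print(exposure_window(4_000, 10))  # 40000
```

The same function shows why the delay matters more than the cap itself: halve the delay and the worst case halves with it, whatever limit you set.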
If that happens, you create a support ticket and AWS/GCP/Azure waive it, especially the first time. They're aware that billing per usage can have surprise effects, but at the same time they don't want to kill their customers' workloads and delete their data, so it is what it is.
Exactly! I know, some of those companies sometimes refund you, but if your livelihood depends on it..? That's a crazy situation to be in as a mere developer.
It's quite easy to check responses to other customers in other threads there, and somehow I see quite a lot of "oh, go to that other support" and ghosting.
If you create a support ticket on Hacker News, then yes, you will probably get it waived. It's somewhat sad that HN is their support forum now.
Google has specifically said that certain API keys like Firebase are not secrets (since people will find them)... though Gemini then ended up changing stuff. https://news.ycombinator.com/item?id=47156925
This should be illegal. If a contractor you hired to swap out a tile on your bathroom floor billed you for remodelling your back garden, you would obviously have the legal right to refuse that.
Not if your contractor had you first sign a 15 page contract that commits you to whatever costs they dream up and requires forced arbitration by a corporate friendly firm when any dispute arises.
Because that's somehow normal in today's tech world.
Slightly OT, but I've always taken a dim view of this sort of thing for consumers because the parties are never at equal parity, either in ability to understand the legalese they're agreeing to, or the ability to seek alternatives.
Legal contracts for consumers should be written at whatever the prevailing reading level is, and the government should step in the more monopolistic position a company is in.
It infuriates me to no end how preferential government is towards corporations vs individuals.
In jurisdictions where bestiality is legal, then yes, from the libertarian perspective, that's all freedom of contract, baby. I'm not defending either bestiality or libertarianism, but the logic is that you don't want the government deciding what two private entities can and can't freely agree to.
We're pretty far from the Lochner era in the US, where even minimum wage laws were held to be unconstitutional violations of a very broad view of freedom to contract. But it is still a principle in most legal systems.
My guess is that at least in Europe they would have a good chance fighting this in court and getting their money back, but it’s a pain having to go through such a lawsuit.
"We can either charge per tile, per job or on demand. Or you can have us on call for a year and get any of the former at a discounted rate."
"Per tile. Lay tiles until I say stop"
>you fall asleep
"Wtf why are you still laying tile"
"You said per tile and lay until you say stop. That'll be 50k please"
The cloud services wrote the contract and the UI for their console. They then encourage young developers to try out their tools and encourage a market environment where those skills are needed to secure employment. Some kid goes and tries to build their first web app, they follow instructions and tutorials but miss that a single default selection on a menu three nested layers down is going to cost $2,000 per month. This isn’t disclosed on the page. Sure, it can be determined by reading several different documents, but the provider chose to not show estimates for costs in the setup.
"Can you lay tiles until I say stop, or until it's about $250 worth, whichever comes first"
"No, as one of the top tile layers in the country I can't do that, for your own protection. What if fifty elephants came and wanted to use your bathroom all at once? You'd feel pretty dumb having to reject them instead of me simply automatically adding $1 million to your bill"
But in this example, the last line of your story is the customer going “yeah, sounds good, let’s do it and hope that doesn’t happen” and signing the agreement.
You hire a contractor and agree they'll bill you per tile, regardless of how many tiles there are. They bill you per tile. End of story.
For a more accurate comparison, consider a utility. You agree to pay your electric bill. It's not the utility's fault you invited all your friends who decided to run a crypto mining LAN party, and they can't cut you off lightly because it might literally kill you (e.g. you live in a hot place and rely on AC to stay alive).
As a manager I avoid Google Cloud because of this kind of customer-service disaster; but as someone who has dealt with large-scale billing systems in the telecom world, probably similar to that of Google Cloud, I am not surprised that it takes 10 minutes to consolidate all the usage logs of a customer for billing.
For telephony, it sometimes takes days when roaming is involved.
You have to imagine TB/sec of data, if not more, coming from thousands of potential sources, and queuing for aggregation to the proper company account, all having to be auditable. This is not a small engineering feat and it can't be real-time.
With that said, telcos usually include in their business model around 2-3% of bad debt (i.e. revenue that won't get paid), which accounts for frauds like this one. Given that the customer seems in good faith and has taken measures upon being notified, Google should manage this bill shock a bit more elegantly.
Moreover, the fact that this happened immediately after this key opened the AI gates means that pirates continuously scan the permissions of all the keys they can gather. Google could and should detect that and act upon it.
> The Gemini API supports monthly spend caps at both the billing account tier and project levels. These controls are designed to protect your account from unexpected overages, and the ecosystem to ensure service availability
The problem is it's specific to that API and defaults to uncapped so people who aren't using it and haven't heard about the issues with the Firebase API keys probably won't have set them.
Spend caps exist for Gemini (Maxious linked them) - they just default to OFF. For an API that can bill four figures per hour, making safety opt-in isn't a UX choice, it's a billing strategy.
Except that Google's own statements are extremely clear that "leaked" (i.e. public) API keys should not be able to access the Gemini API in the first place: "We have identified a vulnerability where some API keys may have been publicly exposed. To protect your data and prevent unauthorized access, we have proactively blocked these known leaked keys from accessing the Gemini API. ... We are defaulting to blocking API keys that are leaked and used with the Gemini API, helping prevent abuse of cost and your application data." https://ai.google.dev/gemini-api/docs/troubleshooting#google...
For extra clarity on the exact so-called "vulnerability" that Google identified, see: https://news.ycombinator.com/item?id=47156925 This describes the very issue where some API keys were public by design (used for client-side web access), so the term "leaked" should be read in that unusually broad sense. Firebase keys are obviously covered, since they're also public by design.
(As for "Firebase AI Logic", it is explicitly very different: it's supposed to be implemented via a proxy service so the Gemini API key is never seen by the client: https://firebase.google.com/docs/ai-logic Clearly, just casually "enabling" something - which is what OP says they did! - should never result in abuse of cost on the scale OP describes.)
Real time spend limits are probably never going to happen. Actual $ amounts are calculated by a centralized billing system offline in batch.
It sounds easy but it’s bonkers complicated, because of things like discounts, free tiers, committed usage, currency conversions and having to support every payment and deal structure in GCP.
Individual eng teams rarely actually think in dollar amounts, they think in the abstraction which is quotas.
I was selling a house in a state I no longer lived in, and was under contract to close the sale, when I got an email from the water company. It told me they suspected based on my water usage that there was a leak on the property.
There had been a very cold February night (like -15F) and a pipe froze inside the walls, and it was just absolutely gushing out. They sent me the email after it had been leaking for a WEEK. I asked a friend to check it out and she said that the laminate floor went "squish" when she stepped in the front door.
Fortunately I was covered by homeowner's insurance since I could prove that my heat had been on, but that was a very unpleasant "warning" to receive!
These companies can sell your personal information in a microsecond in an advertising auction, but somehow can't figure out how to give you timely alerts that stop their cash flow.
The funny thing is that the website only has Firebase Auth, without any AI features.
Someone got the default API key (created a few years back, before the AI features were even released) from the website and started using the Gemini API with it.
This is clearly set up for VC-backed companies where shareholders don't care about spend as long as they can brag about investing in this cool startup at dinner parties. Normal, real businesses should stay away.
You mean openrouter.ai. And yes, on reading this blog post, I immediately reviewed my API keys in OpenRouter to make sure that they were capped. My prod key was capped at $20/day (phew!) but my dev key had no cap, which I just updated. What a horrible story.
You can set it to auto top up if it drops below a certain amount. If you do that, then it would definitely be wise to add a cap. They let you add daily/weekly caps, which is convenient.
> So much for the folks defending these three companies that refused to provide hard spending cap ("but you can set the budget", "you are doing it wrong if you worry about billing", "hard cap it's technically impossible" etc.)
Yes, it's technically+business impossible. To implement a hard cap, a bill never to go over, they'd have to cut your service, but also delete all your data in databases, object storage, data lake, etc. This is simply not an option, so they take the different option of authorising support to waive surprise surcharges / billing DDoSes.
Even if you manage to get your microservices to sync every penny spent to your payment account in real time (impossible), you still have to waive the excess, losing some money every time someone goes past their quota.
Sure, but 80 -> 28,000 -> 54,000 is a hell of a lot of slippage.
Trading platforms can guarantee a maximum slippage on stops, and often even offer guaranteed stops (with an attached premium), so I don’t see why Google and Firebase can’t do similar.
Yep. And cloud providers could eat any slippage cost (enforcing, say, every 5 minutes by stopping service) without even a rounding error on their balance sheets.
The fact that they don’t indicates that there’s no market reason to support small spenders who get mad about runaway overages, not that it’s technically or financially hard to do so.
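A rough sketch of what "enforcing every 5 minutes" implies, assuming a constant burn rate and a check that runs at a fixed interval. Both are simplifications, and `spend_at_cutoff` is a hypothetical name, not any provider's API. The overshoot past the cap is always less than one interval's worth of spend:

```python
def spend_at_cutoff(cap: float, burn_per_minute: float,
                    check_interval_minutes: int) -> float:
    """Total spend at the first periodic check that finds the cap reached."""
    if burn_per_minute <= 0 or check_interval_minutes <= 0:
        raise ValueError("burn rate and check interval must be positive")
    spent = 0.0
    while spent < cap:
        # Usage keeps accruing until the next enforcement check runs.
        spent += burn_per_minute * check_interval_minutes
    return spent


print(spend_at_cutoff(250, 10, 5))     # 250.0   (modest usage, no overshoot)
print(spend_at_cutoff(250, 4_000, 5))  # 20000.0 (runaway usage, one interval of slippage)
```

So with a $250 cap and a 5-minute check, even a $4k/minute abuser gets cut off at $20k rather than €54k, and that bounded slippage is the amount the provider would have to eat.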
> Trading platforms can guarantee a maximum slippage on stops
Yeah no, physically impossible. If nobody is selling at that price, there is no guarantee your sell stop will execute near that price. They can sweep the market, find the best seller price and execute.
There might be a costly way to do it with microservices as I indicated, but your example easily falls apart.
They can take the other side of your order themselves, lose money sometimes, but make it up in the premium they charged you in the first place (or in the old days, from your other trading fees or your monthly subscription payment).
Cloud providers would be taking way less risk interacting with their own services than a broker does interacting with the market. Perhaps they would be more at risk from bad actors, but it shouldn't be significant: they could reserve this behaviour for people who have already spent, say, $100 with them so you can't abuse it at scale.
If they are a market maker, they can buy/sell at or near your stop. It might be a bad idea for them, but if they have a guarantee, this is how they will do it. Or, it will be like the Amazon guarantee (refunding free shipping on your late order).
Not impossible to do: they can hedge and/or absorb the cost, hence the premium. They usually also specify a (fairly large) minimum distance for such stops.
That's exactly what I proposed in my response. Big corp can waive the extra costs to match your limit. Glad we finally got to that part of my response. The question is: will they? Probably not. Do brokers do it? I haven't seen any. Maybe you know more.
I'm with you. And what do you even do when the quota is breached, nuke the resources? People will complain about that just as much as overspends.
I don't buy the 'evil corp screwing people' angle either. They are making farrr too much legit money to care about occasionally screwing people out of 20k and 50k.
If I set a limit, and you cut off my service because I reached the limit, I would definitely not "complain just as much" as if I set a limit and you allowed me to spend past it.
We're not talking about an EC2 or EBS volume here, this is access to an API.
Why aren't we talking about an EC2? Because this is a thread about the Gemini inference API. Loss of service would be restored on payment, not permanent. But that's beside the point: I as the customer set a limit, and you as the service provider did not adhere to it.
I've worked on a number of systems and while it is sometimes impossible to stop at an exact limit, I am confident that it is feasible to stop with less slippage than occurred in this scenario. And at the companies I've worked at, we stop within a margin of error small enough that we're able to absorb any slippage ourselves, as these losses are made up elsewhere, and are worth the customer goodwill. If we can do it, I'm sure Google can.
Believe me, if they tried as hard as FB, an ordinary user wouldn't stand a chance. As it is now, a single uBlock Origin action is enough to make these disappear from my YT page for good, and for all accounts.
Like being paid more (because they work in more dangerous or risky jobs) or having higher suicide rate or being able to wear clothes repeatedly without being judged. Not being forced to wear makeup
I don't get it. People who take dangerous or risky jobs nobody else wants do get paid more. Higher suicide rate has nothing to do with some privileges being taken away. As for clothes, it depends on your social circle - in mine people couldn't care less. As for make up, I have no idea what you mean.
I agree, I meant that armies engaged in conflicts are male like all armies have been in history, save the Soviets who had female battalions for propaganda purposes
> The usual suspect would first need to cross Poland, not to mention finishing what they started in Ukraine.
It doesn't look like they can break Ukraine anytime soon. And each month Ukraine bites back pushing the prospect of full-scale war with NATO (or what is left of it) even further.
I guess we're living in two different areas of Europe. And regarding the last point:
> And the EU regime plans what? To send European military age men to die in a faraway foreign country fighting for foreign interests while their homes and way of living are under attack.
First of all, there is no "EU regime", only countries threatened by Russia daily, which decided they need to increase their defence spending to deter Russia. Europe collectively decided NOT to send people "to die in a faraway foreign country fighting for foreign interests" in spite of Trump's pressure to do so.
Identity verification to use an API?? And via Persona? I can't say if it's real. But if they really try to enforce that, I guess goodbye Anthropic forever.
They were all the same from the beginning. Every tech company of a certain size and significance eventually begins collecting data and sharing it with state actors, as far as I can see.