Hacker Newsnew | past | comments | ask | show | jobs | submit | javawizard's commentslogin

It's better than bleen I suppose.

Speaking of, I'd be curious about a similar experiment but one that compares how grotesque, for lack of a better word, certain words sound. The word bleen makes me uncomfortable, I think because my brain automatically goes to spleen; grue isn't my favorite either but I prefer it to bleen.

I'm curious how universal that is though. Do others have similarly aligned preferences for one word over the other, or are our feelings about them more evenly spread?


This one in particular is going to be difficult to get good results. Depending on your era you may have been eaten by a grue at some point.

Not a native speaker, bleen for me got auto corrected by my brain to green. It doesn't make me uncomfortable, but I'd prefer grue because my brain will immediately understand we're talking about the umbrella term. If grue is said out of context, I'd imagine Gru from despicable me, when written I'd imagine gruel, but, again, because I'm not a native speaker, instead of yucky food I'd instead think about that episode of Masha and the Bear where they end up with a houseful of the porridge.


> 1. Github could choose to grandfather in those plans and make no changes until those plans expire.

They explicitly stated that they won't be doing that: the multipliers go into effect in June for everyone, annual plan or not.


That deserves to be on a plaque somewhere.

I've been using LLMs for much the same purpose: solving problems within my field of expertise where the limiting factor is not intelligence per se, but the ability to connect the right dots from among a vast corpus of knowledge that I would never realistically be able to imbibe and remember over the course of a lifetime.

Once the dots are connected, I can verify the solutions and/or extend them in creative ways with comparatively little effort.

It really is incredible what otherwise intractable problems have become solvable as a result.


What’s your field

Paint by numbers

I gotta say, I love my macbooks. Every Apple laptop I've owned that has USB-C ports will happily charge itself from a 5V/1.5A wall charger (albeit extremely slowly).

That hasn’t been my experience. I once tried to charge an M3 MBP via a lower powered wall plug. It was left off over night and the following morning the battery was still at 1%.

Note:

Some devices expect USB-A on the charger side instead of C

USB-A pump out 1A5V(5W) regardless of what's connected to it, then it negotiate higher power if available.

USB C-C does not give any power if the receiving device is not able to negotiate it


My work has a little power strip with a usb-c and usb-a jack on it at every desk. I can charge my phone and iPad just fine with a USB-C cable into the USB-C port, but when I plugged my MacBook Air into it, it says “not charging.” Going into the system information tool I can see it’s only running at 10W. So apparently 10W is not enough to charge, but it’s still at least keeping the battery from draining.

A 20w charger will definitely charge the MacBook, just slowly.


This was a decent USB plug from Anker. I regularly use it to charge things like iPhones and tablets. I knew it wouldn’t supply enough power to run the MBP but thought it should trickle charge the device over night. But it didn’t.

I can’t recall which cable I used though. The cable might have been garbage but I’m pretty sure I threw out all the older USB cables so they wouldn’t get mixed with more modern supporting cables.


What did it start at?


Not when the article they're commenting on was doing literally exactly the same thing.


Love this.

How did you implement gates? Are they simply tasks Claude itself has to confirm it ran, or are they scripts that run to check that the thing in question actually happened, or do they spawn a separate AI agent to check that the thing happened, or what?


Claude or whatever agent will get a message when it tries to close a task, which tells them which gates are not resolved yet, at which point, the agent will instinctively want to read the task. I did run into an issue where I forgot to add gates to a new project, so Claude did smoosh over by making a blanket gate, I have otherwise never had an issue when I defined what the gate is, Claude usually honors it. I havent worked on big updates recently, but I noticed other tools like rtk (Rust Token Killer) will add their own instructions to your claude's instructions.md file, so I think I need to craft one to tack on with sane instructions, including never closing tasks without having the user create gates for them first.

In a nutshell, a gate is a entry in the DB with arbitrary text, Claude is good about following whatever it is. Claude trying to close a task will force it to read it.

Life's gotten slightly busy, but you can see more on the repo. I've been debating giving it a better name, I feel like GuardRails implies security, when the goal is just to validate work slightly.

https://github.com/Giancarlos/GuardRails


It sounds like a gate is a prompt that shows up at the appropriate time, which works because LLM’s pay more attention to the last thing they read.

It seems like a lot of coding agent features work that way?


I suppose, I mean the LLM is still reading it, the issue is, Beads gives the model a task, and then the model finishes, and never checks anything. I kept running into this repeatedly, and sometimes I'd go to compile the project after it said "hey I finished" it wouldn't compile at all, where if it would have just tried to build the project, it would have just worked.


From my understanding the way Gas Town uses beads is that it's not only "what to do" but also contains a workflow.


Who closes the gate? Is it Claude itself after it runs the verification? Who makes sure the verification did in fact run?


I usually have Claude confirm with me but I've seen it close it if its a unit test that passed for example.


You can't trust it 100%. Sometimes it will just refuse to fix a compiler or lint warning (often saying "This was a pre-existing issue...") or write a trivial test that does nothing and always passes.


> writes code with a lot of warnings > compacts > "This was a pre-existing issue..."

I still take this over writing code myself though.


I'm not saying you shouldn't. I'd say 70% of my work code is written by Claude Code or Codex. But this is something you should be aware of when interacting with agents.


Point being that there are multiple gates to one story, including human testing as one of them.


I built something similar with verifiable gates tasks. The agent has a command to mark the task as done and it will run the bash script, if it passes the task closes, if it doesn’t it appends the failure information into the task description for the agents next attempt at the task.


The trouble with that argument, though, is that it works the other way as well: how do I, a random internet citizen, know that you're not doing the same thing for Anthropic with this comment?

(FWIW I have definitely noticed a cognitive decline with Claude / Opus 4.6 over the past month and a half or so, and unless I'm secretly working for them in my sleep, I'm definitely not an Anthropic employee.)


Oh it's pretty clear to me that Anthropic employs the same tactics and uses bots on socials to push its products too. On Reddit a couple of months ago it was simply unbearable with all the "Claude Opus is going to take all the jobs".

You definitely shouldn't trust me, as we're way beyond the point where you can trust ANYTHING on the internet that has a timestamp later than 2021 or so (and even then, of course people were already lying).

Personally I use Claude models through Bedrock because I work for Amazon, and I haven't noticed any decline. Instead it's always been pretty shit, and what people describe now as the model getting lost of infinite loops of talking to itself happened since the very start for me.


https://isitnerfed.org/

in short, it looks like nothing has been nerfed, but sentiment has definitely been negative. I suspect some of the openclaw users have been taking out their frustrations.


That's fascinating.

Any idea what their test harness looks like? My experience comes primarily from Claude Code; this makes me wonder if recent CC updates could be more to blame than Opus 4.6 itself.


> I know software quality has been going down in recent versions of macOS

Note that this particular problem has existed for well over a decade. It's atrocious, but let's not pretend it's anything new.


The macbook notch has existed for a decade?


No, menu bar items being hidden when there are too many of them has happened for a decade.

The notch has just made menu bar space more scarce than it used to be.


If you opened an app like Xcode with a lot of menus options, it would extend beyond across the screen and cover up your menu bar icons.

If I open Xcode today on a 14" MacBook, two menu items extend past the notch, and they still hide your menu bar icons.

This has been the case for a long long time, and it's always been an obvious failure case.


Menu bar icons overflowing. The notch just makes it a problem quicker, and in an exciting new way.


Yes, and: If I recall correctly, cloudflare is sinking all the extra traffic for him, so it doesn't actually impact him.

Last I heard it's a morally objectionable thing at this point rather than something that's having any practical impact.

(Which of course doesn't make it ok... I'm just a little less inclined to judge people that still use archive links when needed.)


I wonder if that was an automated HN edit?

Similarly to how titles that start with "how" usually have that word automatically removed.


Usually HN only auto-edits on first submission. If you go in and undo it manually as the submitter, you can force it to read how you intend.


Maybe I'm only noticing the times when it messes things up, but it kinda seems like these auto-edits cause a lot of confusion that could be avoided if they were shown up-front to submitters, who would then have the option to undo them.

Or maybe judicious use of an LLM here could be helpful. Replace the auto-edits with a prompt? Ask an LLM to judge whether the auto-edited title still retains its original meaning? Run the old and new titles through an embedding model and make sure they still point in roughly the same direction?


oh interesting, TIL I can go edit my submission titles! That's useful, I've definitely submitted stuff and gotten a less-good title due to the automated fixes, so I'll have to pay attention to this next time


Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: