onlyrealcuzzo's comments | Hacker News

> It obviously went through lots of files in both prompts but total cost? Just $0.09 for the Pro version.

When people say that LLMs aren't worth it, it kills me.

A lot of us, on average, make $100+ an hour. $0.09 is < 4 seconds of our time.

You can't even read the vast majority of prompt responses that fast.

LLMs will continue to get better (though I'm doubtful it will be at previous rates; all indications are that progress is slowing and costs are increasing disproportionately).

It seems like >50% of devs think LLMs provide less than 0 value. I just do not get it.

Did they use an LLM one time 3 years ago and decide it's never going to be worth it? Have they even tried? Or have they only ever tried it on one giant, monolithic proprietary codebase where they're a total expert, decided that an LLM isn't as good as them, and concluded it's "completely worthless"?

They are shockingly unhelpful on my company's codebase.

But that doesn't mean they are flat-out worthless.


I know I'm guilty of making this sort of argument sometimes, but it's just not valid.

I don't get paid for every waking hour of every day. Often I'm using an LLM for something that's uncompensated, so my hourly wage equivalent is irrelevant.

And for times when we might use an LLM for something related to paid work, it's still money out of your paycheck (unless the employer is paying for it; go nuts in that case). And it's not like using the LLM lets you go home early if it saves you time. You just end up doing more work.

I still use them because they're a useful tool sometimes. But I don't pretend it has negligible or no cost. (Not to mention the externalities around electricity use, crazy data center buildout, skyrocketing GPU and RAM prices, etc.)


Second: this seems like something that might be cool.

But as someone who's probably as close to your target audience as you can get, it's not clear to me what exactly this does, or when I would need it.

That may mean I'm not in your target audience, but then I suspect that audience is very small.

My critique:

> Pollen is a self-organising mesh and WASM runtime written in pure Go. Workloads are "seeded" into the cluster and organically scale and follow load. There is no central coordinator; decisions are made deterministically, locally, using a gossiped CRDT runtime state as their source of truth. Same view of the world; same workload placement and routing.

Sentence one is fine. It could probably be less mumbo-jumbo-y.

Sentence 2 should be paragraph 2.

Your actual sentence 2 should be something along the lines of: what a self-organizing mesh is, and when it's useful (IMO).

I also would suggest not using CRDT right away. I think you might have a lot of people who might be interested in this, but don't know exactly what that is or why it's useful.


> but don't know exactly what that is or why it's useful.

I hate to say it, but the only applications I can think of can be best categorized as illegitimate, likely clandestine, distributed computing tasks.


I'm unaware of a better solution for local-first / offline-first software syncing (which IMO should be more common)!

They're also great for collaborative text editing or if you're building a distributed database (not many people, but I'm in an adjacent field).

At MASSIVE scale (inherently not many people), they're also good for things people take for granted, like counting (and other things people don't take for granted).

Again, it's not clear to me exactly where and why Pollen helps in any of these scenarios.


Further, LLMs consistently think LLM-written content is "good".

Ask an LLM to write a design doc for you, wait until you get one that's very bad, then send it to other LLMs for feedback: they will typically have good things to say.

Compare that to a very well written document you have. They will typically have a lot more bad things to say, even if the premise is solid.

Someone should study this.

LLMs clearly have a lot of value. But IMO this is very interesting: it points to a weakness whose full ramifications aren't yet clear.

I suspect LLMs also have a major bias toward code they write.

Take something universally considered to be well written, like Redis, and feed it to an LLM for feedback. They'll probably find much to pick apart (and a lot of it may be flat-out wrong).

Feed the same LLM some clearly garbage LLM-generated repository. Do they have a similar response as they do with design docs? Do they treat natural language differently than code, such that they're only susceptible to their own style in prose and not in logical code? Or do they have the same problem?

Has anyone done this?


> I really need to learn more about Zig, but from what I know, there are still worlds of possibilities that a modern, well-designed language offers over something like lib0xc.

Doesn't Apple have a nice `defer { }` block for cleanup? Did you include that in lib0xc? I didn't see it in your README.


I think defer has been included in the next round of working group proposals for C2y, but I don't think Apple's clang has it. Maybe it's there as a language extension and I just didn't see it.

What lib0xc has is some cleanup attributes that you can apply to variables to e.g. automatically free a heap allocation or close a file descriptor, at end of scope. Personally, I like variable annotations much more than defer for these uses, but they accomplish the same thing. I've also found that using those attributes inherently pushes your code to make ownership more explicit. I personally stopped being terrified of double-pointers and started using them for ownership transfers, which eliminates a large class of bugs.


> I've also found that using those attributes inherently pushes your code to make ownership more explicit. I personally stopped being terrified of double-pointers and started using them for ownership transfers, which eliminates a large class of bugs.

This is very interesting. Do you have a practical example?


Yeah here's a trivial one.

    void *__free p = NULL;

    func(&p); // func zeroes p to claim ownership

    // end of scope: p is NULL, so nothing happens
    // if func had not been called, p would be freed here


In C++ you can implement such a thing using destructors, which are guaranteed to run in reverse order on scope exit, even in the presence of exceptions. Andrei Alexandrescu's ScopeGuard did this (around 2000, long before C++11). But in standard C, there's no mechanism that this could be attached to (especially if you want to use "C exceptions", a.k.a. setjmp()/longjmp()).

Maybe the compilers they support all have non-standard extensions that allow something like this though?


Yes. Because all major compilers support a non-standard mechanism that can implement defer, it's now being considered for inclusion in standard C. [0]

And that proposed defer feature is already available in GCC 9 and Clang 22.

[0] https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3734.pdf


I've found that for non-trivial features, I typically benefit from 3-4 rounds of questions: Are you sure this isn't tech debt? Are you sure this is thoroughly tested (manually insert the applicable cases, because they aren't great at this, even if explicitly asked)? Are you sure this isn't reinventing wheels, or adding unnecessary complexity by not using existing infrastructure, or that other existing code wouldn't benefit from moving to this? Are you sure you can't find any bugs? In hindsight, are you sure this is the best design?

Then, after it says "yes, I'm sure this is production-ready and we're good to move on", you have Codex and Gemini both review it one last time, and ask it to address whichever of their feedback is valuable.

After all this, it's the only time I'll look at the code and review it and make sure it's coherent.

Until then, I assume it's garbage.

I'd estimate this still improves velocity by 10x, and more importantly, allows me to operate at a pace I couldn't without burning out.


working this way would drive me nuts

Why? It's not that different from managing engineers.

With engineers, you're just getting less work done on a slower cadence, and asking the same questions in design reviews and code reviews...


it's very different. LLMs don't behave like people. they don't learn.

i don't mind managing people, but i don't want to manage machines unless i can control them with the precise languages that the command line and programming languages use. prompting an LLM is too vague an interface for me; the outcome is too unreliable, too unpredictable.


> You already lost me here.

Agreed.

I'm working on a language designed for machines to write and humans to understand and review.

It doesn't seem worthwhile to have code nobody can understand.


> I'll be more careful with that framing.

I think you should also try to do a better job selling the benefit of this.

As a data engineer, I can see why this might be useful, but glancing through your README, the dots were not completely connected.


Makes sense. Reviewing the README is on my TODO list. Thanks for the heads up!

> Pre-agent, there wasn't always an obvious difference between models. Various models had their charms. Nowadays, I don't want to entertain anything less than the frontier models. The difference in capability is enormous and choosing anything less has a real cost in terms of productivity.

It's just apples to oranges.

There is not a clear, across the board, winner on non-agentic tasks between Gemini, ChatGPT, and Claude - the simple chatbot interface.

But Claude Code is substantially better than Codex, which itself is notably better than Gemini CLI.

In this vein, it should not be surprising that Claude Code is way better than non-frontier models for agentic coding... It's substantially better than other frontier models at specialized agentic tasks.


I’ve been comparing Claude Code and Codex extensively side by side over the past couple of weeks with my favorite prompting framework superpowers…

From my perspective, Claude Code is decidedly not better than Codex. They’re slightly different and work better together. I would have no issues dropping CC entirely and using codex 100%.

If you’re working off of “defaults”, in other words no custom prompting, Claude Code does perform a lot better out of the box. I think this matters, but if you’re a professional software developer, I’d make the case that you should be owning your tools and moving beyond the baked in prompts.


CC is not better than Codex, nor is it better than OpenCode, Crush, Pi etc…

I think there's a fair amount of evidence that the heavy harnesses actually drag down performance compared to bare harnesses.

> They knew how to write Rust, but clearly weren't sufficiently experienced with Unix APIs, semantics, and pitfalls.

The point of Rust is that you shouldn't have to worry about the biggest, easiest-to-fall-into pitfalls.

I think the author's point in this article is that a proper file system API should do the same.


Presumably, every company that has non-LGPL CC code in production wants to own it...

"Own" as in "be responsible for". Nobody is too keen to own a pile of semi-working trash, and extensive vibe-coding can produce such piles easily.

Not sure why this is being downvoted. Outsourcing work doesn't also outsource accountability.

Anyone can produce low-quality code, with or without AI. Agents have gotten exceptionally good however, and everyone should be including them in their workflow if they're able to.

Agents are more prolific. As with any power tool, they increase both your ability to build and to wreak havoc, depending on how you handle them.

Yea, that is how I meant it.

> "Own" as in "be responsible for". Nobody is too keen to own a pile of semi-working trash

And yet that was the state of software at every company I worked at before FAANG, and even a good amount there...

