
Generating a QR code to see the graph online is kind of cool, but also kinda dumb.
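
For the curious, the mechanism is simple enough to sketch. Here's a minimal Python version, assuming the third-party "qrcode" package and a made-up graph-viewer URL (neither is what TI actually uses, this is just the idea):

    # Sketch: turn a shareable graph link into a QR code image.
    # Requires the third-party "qrcode" package (pip install "qrcode[pil]").
    import qrcode

    # Hypothetical viewer URL encoding the function y = x^2 + 1.
    graph_url = "https://example.com/graph?fn=x%5E2%2B1"

    img = qrcode.make(graph_url)   # returns a PIL image of the QR code
    img.save("graph_qr.png")       # scan with a phone to open the graph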

I mean, these days kids have smartphones, what's the point of a graphing calculator?


Ironically, built-in smartphone calculators are really bad, and one of the best ones you can download might be Graph 89 (a TI-89 emulator).

Rant/Aside: Smartphones (or at least Android) are just generally really bad at being... smart, especially out of the box. No dictionary? No thesaurus? To say nothing of a built-in encyclopedia (e.g. Wikipedia). A calculator worse than the $1 scientific ones? When you look at the complete absence of basic functionality anyone from 50+ years ago might expect them to have, it's astounding how obvious it is that they're meant to dumb people down and just sell you crap.


> kids have smartphones, what's the point of a graphing calculator?

Many tests will not allow you to use a smartphone. My son couldn't even use the school-issued Chromebook on his PSAT; he had to get a loaner Windows laptop or use an approved handheld calculator.


I'm with you. Some open source app is all they need.

However, to answer your question: phone rules in classrooms vary enormously, and a dedicated calculator is faster to work with when you're drilling problems in a homework setting.

I finished high school in the (gasp) 20th century, so the modern classroom is certainly something I've had to learn about.


The comments on this are fascinating, although I was waiting for someone to chime in with "HP is better cuz RPN."

2 dinners out for a family of four would cover the cost of this calculator. If my kid's school required this for math, I wouldn't bat an eye at purchasing one.

I needed a TI-83 for school in 1996-1998. If you couldn't afford one, the school would loan you one for the semester. Band instruments were the same way.


> I was waiting for someone to chime in with "HP is better cuz RPN."

Well, it is ;) The SwissMicros clones are pretty awesome:

https://www.swissmicros.com/product/dm41x
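
For anyone who never used one: in RPN, operands go onto a stack and each operator consumes the top entries, so parentheses are never needed. A minimal sketch of the idea in Python (my own illustration, nothing to do with SwissMicros firmware):

    # Tiny RPN evaluator: numbers push onto a stack,
    # operators pop two values and push the result.
    import operator

    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}

    def rpn_eval(expr: str) -> float:
        stack = []
        for tok in expr.split():
            if tok in OPS:
                b = stack.pop()          # right operand is on top
                a = stack.pop()
                stack.append(OPS[tok](a, b))
            else:
                stack.append(float(tok))
        return stack.pop()

    print(rpn_eval("3 4 + 2 *"))  # (3 + 4) * 2 = 14.0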


I have two SwissMicros units and a pile of sapphire-chip, designed-in-Corvallis, USA-made HPs. The DM41X is pure joy in the hand. But I still texted the pink TI-84 EVO to my 16-year-old daughter, because she doesn't like my stodgy TI-84 Plus CE (which I love).

Same story here. I still have the HP-15C I learned assembly language on ;) None the worse for wear. They'll all come in handy after the apocalypse.

I'd take a DM42n over any TI any day.

Claude does not have a "theory" of anything, and I'd argue applying that mental model to LLM+Tools is a major reason why Claude can delete a production database.

Well, humans also routinely accidentally delete production databases. I think at this point, arguing that LLMs are just clueless automatons that have no idea what they are doing is a losing battle.

They’re not clueless; they just don’t have a memory, and they don’t have judgement.

They create the illusion of being able to make decisions, but they are always just following a simple template. They do not consider nuance, and they cannot judge between two difficult options in any real sense.

Which is why they can delete prod databases, and why they cannot do expert-level work.


> they cannot do expert-level work

Well, this is just factually incorrect, considering they are currently on par with grad students in some areas of mathematics.


Not sure if you are being pedantic, but mathematics is quite different from other fields because it is highly structured, reasoning is explicit, and it contains a dense volume of high-level training data. Results can be verified easily because of that structure.

Even then, they are most effective as assistants and are not able to produce results independently. If you have proof otherwise, I would love to read up on it.


I like to think of LLMs as idiot savants. Exceptional at certain tasks, but might also eat the tablecloth if you stop paying attention at the wrong time.

With humans, you can kind of interview/select for a more normalized distribution of outcomes, with outliers being less probable, but not impossible.


When you're applying reasoning like this, sure, why not? What difference would it make?

I mean, maybe it’s a losing battle today, but it is correct. So in a few years, when the dust settles, we’ll probably all be using LLMs as clueless automatons that still do useful work as tools.

Well summarized.

We're also seeing that the people up top are using this to cull the herd.


> I've used Claude for many months now. Since February I see a stark decline in the work I do with it.

I find myself repeating the following pattern: I use an AI model to assist me with work, and after some time, I notice the quality doesn't justify the time investment. I decide to try a similar task with another provider, run a few more tests, then decide to switch over for full-time work, and it feels awesome, like it's doing a good job. A few months later, it feels like the model got worse.


I wonder about this. I see two obvious possibilities (if we ignore bias):

1. The models are purposefully nerfed before the release of the next model, similar to how Apple allegedly nerfed their older phones when the next model was out.

2. You are relying more and more on the models and using your own talent less and less. What you are observing is the ratio of your work to the model’s leaning more and more toward the model’s. When a new model is released, it produces better-quality code than before, so the work improves with it, but your talent keeps deteriorating at a constant rate.


I definitely find your last point true for me. The more work I am doing with AI, the more I am expecting it to do, similar to how you can expect more over time from a junior you are delegating to and training. However, the model isn't learning or improving the same way, so your trust is quickly broken.

As you note, the developer's input is still driving the model quite a bit, so if the developer is contributing less and less as they trust more, the results would get worse.


> However the model isn't learning or improving the same way, so your trust is quickly broken.

One other failure mode that I've seen in my own work while I've been learning: the things that you put into AGENTS.md/CLAUDE.md/local "memories" can improve or degrade performance, depending on the instructions. And unless you're actively and quantitatively reviewing when performance is improving or degrading, you probably won't pick up that the two sentences you added to CLAUDE.md two weeks ago are why things seem to have suddenly gotten worse.

> similar to how you can expect more over time from a junior you are delegating to and training

That's the really interesting bit. Both Claude and Codex have learned some of my preferences by me explicitly saying things like "Do not use emojis to indicate task completion in our plan files, stick to ASCII text only". But when you accidentally "teach" them something that has a negative impact on performance, they're not very likely to push back, unlike a junior engineer who will either ignore your dumb instruction or hopefully bring it up.
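
To make that concrete, here's the kind of memory-file entry being described; the first line paraphrases my actual instruction above, while the second is a hypothetical example of an entry that could quietly help or hurt:

    # CLAUDE.md (hypothetical excerpt)
    - Do not use emojis to indicate task completion in plan files; ASCII text only.
    - Prefer small, reviewable commits over large batched changes.

Nothing in the tool flags which of these is improving your results and which is degrading them; you only find out by measuring.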

> As you note, the developer's input is still driving the model quite a bit, so if the developer is contributing less and less as they trust more, the results would get worse.

That is definitely a thing too. There have been a few times when I have "let my guard down," so to speak, and haven't deeply considered the implications of every commit. Usually this hasn't been a big deal, but there have been a few really ugly architectural decisions that made it through the gate and had to get cleaned up later. It's largely complacency, like you point out, as well as burnout from trying to keep up with reviewing and really contemplating/grokking the large volume of code output that's possible with these tools.


Your version of the last point is a bit softer, I think: the parent put it down to “loss of talent”, but yours captures the gaps versus natural human interaction patterns, which seems more likely, especially on such short timescales.

I confusingly say both. First I say that the ratio of work coming from the model is increasing, and then when clarifying I say “your talent keeps deteriorating”. You correctly point out that these are distinct, and maybe the distinction is important, although I personally don’t think so. The resulting code would be the same either way.

Personally, I can see the case for both interpretations being true at the same time, and maybe that is precisely why I confused them so eagerly in my initial post.


I don’t think the providers intentionally nerf the models to make the new one look better. It’s a matter of them being stingy with infrastructure, either by choice, to increase profit, or through sheer lack of resources to keep n+1 models deployed in parallel without deprecating older ones when a new one is released.

I’d prefer providers to simply deprecate stuff faster, but then that would break other people’s existing workflows.


Point 2 is so true; I definitely find myself spending more time reading code vs. writing it. LLMs can teach you a lot, but it's never the same as actually sitting down and doing it yourself.

I think it might have to do with how models work and their fundamental limits (yes, they're stochastic parrots; yes, they confabulate).

Newer (past two years?) models have improved "in detail," or as pragmatic tools, but they still don't deserve the anthropomorphism we subject them to just because they appear to communicate like us (and therefore appear to think and reason like us).

But the "holes" are painted over in contemporary models - via training, system prompts and various clever (useful!) techniques.

But I think this leads us to have great difficulty spotting the weak spots in a new or slightly different model; as we get to know each particular tool, each model, we get better at spotting the holes in that model.

Maybe it's poorly chosen variable names. A tendency to write plausible-looking, plausibly named e2e tests that turn out to not quite test what they appear to test at first glance. Maybe there's missing locking of resources, or missing use of transactions, in sequential code that appears sound but ends up storing invalid data when one or several steps fail...

In happy cases, current LLMs function like well-intentioned junior coders enthusiastically delivering features and fixing bugs.

But in the other cases, they are like pathologically lying sociopaths, telling you anything you want to hear just so you keep paying them money.

When you catch them lying, it feels a bit like a betrayal. But the parrot is just tapping the bell so you'll keep feeding it peanuts.


The Freakonomics podcast had a recent episode about cheating with PEDs, and interviewed the (former) head of the Enhanced Games. At one point, he discussed the benefit to society of athletes being monitored for 5 years post-performance.

To me, it seemed like a modern-day tech take on human cockfighting.


In my opinion, the problem with PEDs isn't adults taking them, provided they admit to taking them.

The problem is with adolescents taking them. Adolescent boys see a really nice immediate payoff from taking PEDs (better musculature and better sports performance -> more popular), while the downsides are off in the future. It's really hard to fight that.

Even when I was in high school several decades ago, we had a handful of people on PEDs. And we were a tiny school with no significant sports programs. I can't imagine what it's like now, with social media pushing everything.


> In my opinion, the problem with PEDs isn't adults taking them, provided they admit to taking them.

The incentive to cheat and hide was one of the points from the podcast. In cycling, in order to win, you have to compete with other cyclists who are doping, and who are doing so in a way that makes them unlikely to get caught. So in order to win, you have to dope and not get caught. You're not forced to dope, but the option is there, and it's yours to take should you choose.


Honestly, PEDs are stigmatized and under-researched with respect to the performance-enhancing aspect. They have undoubted side effects, but how severe, why, etc. is kind of murky from what I saw when I was looking into this; bro science is the best you can get. There are a few studies here and there giving people modest testosterone boosts and measuring athletic performance.

Not saying we should be promoting them, but if we can eventually get to the point where we eliminate the really bad side effects and keep most of the benefits, it's going to be a great thing for everyone: the next thing after GLP-1.


I do not have the background to make medical decisions based on reading published medical articles, so I have to trust my doctor's advice and seek second opinions if I'm not convinced.

My issue was the disingenuous use of "5-year post-compete" monitoring as a justification for the Enhanced Games.


I'm a lib, but enough is enough. Let gun owners have their guns. Let 3D printer owners have their prints. Neither group is the problem.


Help me understand. Is this just AI replacing influencers?


More like a TikTok spam botnet for hire.


Thank you for explaining that.


John Lithgow had a take I agreed with: her opinions were heavily misconstrued, though she chose to double down at her own peril.


> and a few of them did get pretty far, but ultimately not a single one actually launched.

Having done this professionally for a very, very long time, I can tell you software engineers aren't particularly good at launching products.

Technology has drastically lowered the barriers to bring software products to customers, and AI is a continuation of that trend.

