This might just be the frequency illusion at play, but there seem to have been a number of high-profile supply chain attacks of late in major packages. There are several articles on the first few pages of HN right now with different cases.
Looking back ten years to `left-pad`, are there more successful attacks now than ever? I would suspect so, and surely the value of a successful attack has also increased, so are we actually getting better as a broad community at detecting them before package release? It's a complex space, and commercial software houses should do better, but it seems that whilst there are some excellent commercial products (e.g. CI scan tools), generally accessible, idiot friendly tooling is somewhat lacking for projects which start as hobby/amateur code but end up being a dependency in many other projects.
I've cross-posted my comment from the current SAP supply chain attack thread [0].
>This might just be the frequency illusion at play, but there seem to have been a number of high-profile supply chain attacks of late in major packages.
I think it is a real increase in the rate of detected attacks, not just awareness, but whether that's an increase in vigilance or an increase in attacks is hard to know. I suspect both, if nothing else because awareness drives both vigilance and copycat attackers inspired by the earlier attacks.
I looked pretty hard, with some LLM assistance, so if the answer were "we're just hearing about it more now", there would have to be old attacks that happened without ever being discovered and written up.
Yeah, and ultimately nobody cares. Everyone assumes it's just some process miss, and we need to add another step to the process and move on. Fuck-ups that would have killed the credibility of projects 10 years ago are now treated as "eeh, what are you gonna do. Sometimes you ship malware. We'll look into it."
Some of us are very aware and concerned about the risk. But like Cassandra from Greek mythology, we see the coming disaster and feel powerless to stop it.
More like hiding their heads in the sand in circumstances that are outside of their ability to fix. None of the tooling or practices out there push you in the direction of not being at risk, or even provide you with easy ways to stay completely safe: no external packages needed to develop software, with everything you NEED provided out of the box; or a flow where pulling in a new package makes you review all of its source code line by line and compile everything instead of pulling any binary tooling blobs; or built-in vulnerability and configuration scanning (the kind of thing Trivy does) so you don't get pwned or leave an open S3 bucket somewhere, which also means you'd obviously need thorough observability and alerting for any of the cloud stuff you do.
And even when they exist, your org's projects might be painfully out of date, too much so to use those approaches, or the org culture might not be there, or any number of other issues I can't even imagine. On one hand, people are running out-of-date software with known CVEs; on the other, using dependencies that are too new puts you at risk of compromised packages. It's like we're being squeezed by rocks on both sides in a landslide or something. Even at the OS level, the fact that everyone is not running something like Qubes OS or regular VMs for development is absolutely insane. The fact that all software isn't sandboxed and that desktop OSes don't prompt for permissions like mobile apps do is absolutely insane. That we don't have firewalls like GlassWire as standard that prompt you for external connections, or don't allow easily blocking what you don't trust, is insane.
Despite lots of people trying their best, on some level, everything both up and down the stack is absolutely fucked for a variety of complex reasons. You'd have to largely tear it all down and rebuild everything starting with your OS kernel in a memory safe language and formal proofs and thorough testing for everything (if it took SQLite as long as it did to get a decent test suite, it might as well take on the order of decades to do it for a production OS kernel and drivers), then do the same for all userland software and DBs and tooling and dependency management and secrets management (not just random files, special hardware most likely) and so on. It's not happening, so we just build towers of cards.
Personally I was sus of Tailscale, but stopped even pondering it after they got an RCE.
Same with npm and large dependency trees full of 10.5-line libraries of low quality.
Lightning always seemed to be the left-pad of PyTorch. It was basically a replacement for a for loop and a couple of backward/step calls. I'm sure it has since grown to replace a few more lines of code, though. Like maybe 100.
If you want to look for a coming disaster, look no further than HuggingFace libraries that for some reason quite a lot of projects use these days, especially transformers package. Sadly even vllm depends on it.
Are you talking about open source or commercial products?
I can't speak for the PyTorch Lightning case, but I wouldn't be surprised if the maintainers didn't get any $ from it. They would be sad if the credibility of the package suffered, but ultimately it wouldn't make a big difference to them.
The reason is that auto-updates and CI tools have reached a critical saturation and everybody uses them. Years ago, `npm install` would have been more likely to be run manually, and only if something in the build breaks - which means once in a blue moon. Supply chain attacks depend on people (or more likely, pipelines) mindlessly auto-updating packages as soon as they are released.
> idiot friendly tooling is somewhat lacking for projects which start as hobby/amateur code but end up being a dependency in many other projects.
Historically, extra-security-scanned artefact handling has been a paid enterprise option. Whereas the less secure option is the much-less-hassle default.
IDK how good a business model this is, I suspect not very.
FWIW, left-pad was not an attack; it was a bug in NPM. It should not be possible to unpublish package versions that are depended on by other published packages. On the other hand, it should be possible to unpublish certain package versions that are new and not depended on.
NPM should have returned error codes when the author of left-pad attempted to remove all his data with the intention of leaving the service.
To quote Wikipedia:
> After Koçulu expressed his disappointment with npm, Inc.'s decision and stated that he no longer wished to be part of the platform, Schlueter [author of NPM] provided him with a command that would delete all 273 modules that he had registered.
> Looking back ten years to `left-pad`, are there more successful attacks now than ever? I would suspect so, and surely the value of a successful attack has also increased, so are we actually getting better as a broad community at detecting them before package release?
The value has increased, and that is what drives all these attacks. Cryptocurrencies are to blame in particular, because they not only provided a way to launder the proceeds but also became a juicy target in themselves.
And what is stolen with today's malware? Cloud credentials. Either to use for illicit mining, which is on the decline, or to run extortion campaigns, which is made possible by cryptocurrencies. All too often it's North Korea or Iran running these campaigns.
The attacks from TeamPCP were successful at stealing credentials recursively. So it is very likely that someone working on this PyTorch-related package recently pulled the bad litellm or trivy (or whichever of the eight or so others there were).
And the reason it jumps from npm to pip to whatever is that it's trying to find all the user's keys in well known locations for any of these repos.
So teampcp is sitting on tens of thousands of passwords or keys and they just need time to run tests on them to figure out what packages they can release to get even more attacks out there.
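Those "well known locations" are easy to enumerate yourself. A minimal audit sketch, to see which credential files exist on a given machine; the path list below is my own illustrative selection, not an inventory of what any particular malware actually targets:

```python
from pathlib import Path

# Common locations where package-manager and cloud credentials live.
# Illustrative, not exhaustive.
CREDENTIAL_PATHS = [
    ".npmrc",                # npm auth tokens
    ".pypirc",               # PyPI upload credentials
    ".netrc",                # generic machine/login/password entries
    ".aws/credentials",      # AWS access keys
    ".docker/config.json",   # Docker registry auth
    ".config/gh/hosts.yml",  # GitHub CLI tokens
]

def find_credential_files(home: Path) -> list[Path]:
    """Return the credential files that actually exist under `home`."""
    return [home / rel for rel in CREDENTIAL_PATHS if (home / rel).is_file()]

if __name__ == "__main__":
    for p in find_credential_files(Path.home()):
        print(f"found: {p}  -- consider rotating whatever it holds")
```

If a machine ran a compromised package, every secret this kind of scan finds should be assumed exfiltrated and rotated.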
Why haven't all the major repo vendors done a full cred wipe? No idea (unless they have and I just wasn't on the email list).
> Looking back ten years to `left-pad`, are there more successful attacks now than ever?
I can't vouch for the number of attacks but, since we are talking about Python, nothing has substantially changed since the time of `left-pad`. The same bad things that enabled supply chain attacks in Python ten years ago are in place today. However, it looks like there are more projects and they are more interconnected than before, so it's likely that there are either more supply chain attacks, or that they are more damaging, or both.
Here's my anecdotal experience with Python's packaging tools. For a while, I was maintaining a package to parse libconfuse configuration language. It started as a Python 2.7 project, but at the time there was already some version of Python 3 available, so, it was written in a way that was supposed to be future-proof.
I didn't need to change the project's code in the last ten or so years, but roughly once a year something would break in setup.py. Usually because PyPA decided to remove a thing that wasn't bothering anyone.
When Python 3.13 came out, like clockwork, setup.py broke. I rolled up my sleeves and removed the dependency on setuptools; instead, I wrote some Python code that generated a wheel from the project's sources. I didn't look up the specification of the RECORD file in the dist-info directory, and assumed that sha256().hexdigest() would generate the checksums in the desired format. And that's how I shipped my packages...
Some time later, the company added an AI reviewer to the company's repo, and it discovered that, instead of hexdigest(), the checksums have to be urlsafe-base64-encoded with the padding removed...
Now, to the punchline: nobody cared. The incorrectly generated packages installed perfectly fine without warnings. Nobody checks the checksums.
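Checking them takes only a few lines. A sketch of the verifier nobody runs, assuming the standard RECORD layout of `path,sha256=<digest>,size`, where the digest is urlsafe base64 with the `=` padding stripped:

```python
import base64
import csv
import hashlib
from pathlib import Path

def record_digest(data: bytes) -> str:
    """Checksum in the format RECORD expects: urlsafe base64, no '=' padding."""
    return base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=").decode()

def verify_record(site_packages: Path, record_file: Path) -> list[str]:
    """Return the paths whose on-disk contents don't match their RECORD entry."""
    bad = []
    with record_file.open(newline="") as f:
        for row in csv.reader(f):
            if not row or not row[1]:  # RECORD lists itself with no hash
                continue
            path, hash_spec, _size = row
            algo, _, expected = hash_spec.partition("=")
            assert algo == "sha256"
            if record_digest((site_packages / path).read_bytes()) != expected:
                bad.append(path)
    return bad
```

Pointing `verify_record` at any `*.dist-info/RECORD` in a site-packages directory flags files that were modified (or, as in my case, hashed in the wrong format) after installation.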
What's more: nobody checks during `pip install` or the fancier `uv pip install` that the packages aren't built locally (i.e. nobody cares that package installation can result in arbitrary code execution). It's not just common, it's almost universal to run `pip install` on production machines as a means of deploying a Python program. How do I know this? The company I work for ships its Python client as a... source package. Not intentionally. We are just lazy. But nobody cares.
> It's not just common, it's almost universal to run `pip install` on production machines as a means of deploying a Python program.
Maybe a Python culture problem; maybe a hallmark of Python's status as an "easy to hire for", manager-friendly, least common denominator blub language; maybe a risk that stems from the conveniences of interpreter languages... but this is such a shame in this day and age.
It's seriously not difficult to do better. And if this is what you're doing, you're also missing out on reproducible environments both in dev and in prod. At least autogenerate a Nix package! You still don't need to publish any artifacts, but you can at least have the thing build in a sandbox or yeet the whole closure over SSH.
It's also not that hard to get a Docker image out of a Python project.
You only need one platform-minded person on the whole development team to make this happen.
"Almost universal" is a bit of a stretch: most of the time these days, Python apps are deployed as Docker containers, and if you're using k8s this becomes effectively mandatory.
However, a lot of the time, especially for older codebases, the Docker build will just run pip install from public PyPI without a proper lockfile.
So at least install code isn't being executed on your production machine, but there's still a significant surface area for supply chain attacks.
Well, the install code can leave some code behind that will be executed on the production machine... It doesn't really help being in a container. While a separate problem from Python ecosystem, people really put a lot more faith in isolation offered by containers than they should. Also, it's often very tempting to poke holes in that isolation because it's difficult and up to impossible sometimes to get things done otherwise.
It's probably the same people who think that merely having a requirements.txt listing packages with versions, or even without them (2010 sends its regards), is fine. Open a random open source Python project on GitHub, and chances are you will see this kind of thing. Stands to reason that people in companies aren't acting much differently.
As scary as it is right now, it warms your heart a little bit that this system existed for 30 years and is only now reaching a crisis point.
I ran an open source project with tens of thousands of downloads (presumably all either developer machines or webservers, so even a small number is valuable) and never received a malicious pull request, offer of a bribe to install malware, or a phishing attempt with enough effort to even catch my attention.
What it says to me is that there weren't a lot of people working on the crime side of this. It's like dropping your wallet in a bar bathroom and coming back to find it still there.
virtualenv isn't relocatable out of the box, so how else would you deploy a python project?
You can call it laziness, but it's not like the python ecosystem has ever developed an answer for this problem. The only reasonable answer has been to use docker, which is basically admitting that the python community did nothing.
> virtualenv isn't relocatable out of the box, so how else would you deploy a python project?
My team has a handful of Python projects. Here's how they work:
devenv.nix provides a Python runtime and all native dependencies, git hooks for linters and things like this. It integrates with direnv and the Python package manager (currently Poetry 1.x for older projects and uv for newer ones) so that when you cd in you get a virtualenv with everything you need, scripts in the project (or stubs for them) magically appear on your PATH so you don't need to use `uv run` or whatever it is for anything.
flake.nix provides a publishable artifact for projects that we run on workstations or servers. It autogenerates a Nix package from pyproject.toml and friends. You can reproducibly build it across platforms without virtualization, you can push it up to a binary cache and avoid source builds, whatever. It's great.
For projects that we run in cloud-native containers (for us AWS Fargate and AWS Lambda), we don't currently ship our own container images. We just publish zip files that we generate with a Poetry plugin that runs builds inside containers that have the same images as are used by AWS in its default runtime environments and push them up with the AWS CLI. The exact steps are stored as a Devenv script so the CI can be a one liner and you can run everything locally just like you would in CI.
> the python community did nothing
Python sucks.
But you can still represent your Python project as a proper Python package and get reproducible-ish build artifacts that are local-first and embrace Python-native tooling and ship it up to prod in a portable format with or without Docker. It only takes one engineer spending a day or two to work it out once for the whole team or maybe the whole company. You just need someone to be willing to RTFM on a package manager or two. The Python community seems to be largely lacking such people but your team doesn't have to be.
Oh, that's a sore spot with me, but I'm glad you asked!
So, for the purpose of full disclosure, I have a personal and professional grudge with PyPA, which also touches on how pip is being managed, beside other packaging issues. It's not the side you want to be on, so, be warned!
So, without further ado: I write my own code to generate the deployed artifact. In my case, I take all the wheels installed in my environment, extract them, and merge them into a single wheel. The process also usually involves removing a bunch of junk from the packages packaged in such a way. You'd be surprised how much nonsense people put in their distributed packages... like, their unit tests, or documentation in HTML / PDF format, __pycache__ files (together with the sources)... the list goes on.
But, it works because I curate what's being installed. I don't trust pip to install just what I need and nothing else. I run it in a separate environment, where I examine the packages that have been installed as dependencies, figure out why any of them were installed (you'd be surprised how often you don't need them!), then make a list of the dependencies I actually need, with exact versions and checksums, and use a Python or shell script to download and install them in the actual development environment.
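The core of that kind of script is small. A sketch of the download-and-verify step, with a placeholder pin list (the URL and checksum below are dummies; real pins would point at specific wheel URLs with their actual sha256 digests):

```python
import hashlib
import urllib.request
from pathlib import Path

# Curated dependency list: exact artifact URL plus its expected checksum.
# These entries are placeholders, not real packages.
PINNED = [
    ("https://example.invalid/wheels/somepkg-1.2.3-py3-none-any.whl",
     "0000000000000000000000000000000000000000000000000000000000000000"),
]

def fetch_verified(url: str, expected_sha256: str, dest_dir: Path) -> Path:
    """Download one artifact and refuse it unless its sha256 matches the pin."""
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise RuntimeError(f"checksum mismatch for {url}: {actual}")
    dest = dest_dir / url.rsplit("/", 1)[-1]
    dest.write_bytes(data)
    return dest
```

The point is that nothing lands in the environment unless a human put its exact URL and checksum into the pin list first; a tampered or silently re-released artifact fails the hash check instead of installing.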
This isn't a good idea when you have many short-lived projects, but, in my case, the typical project lifespan is measured in decades and there aren't that many of them. So, I can expend the extra effort required to do that.
Unfortunately, I don't think there's a way to automate the process. The key point is that there's a human who sifts through dependencies and figures out what to do with them. Partially automate, maybe... but I can't think of a way to make this into a program that I could give someone.
No need to invoke frequency illusion when every moderate HN lurker already stopped counting. https://socket.dev/blog gives a good impression, but a dedicated article would be nice. Maybe recurring once or twice a year.
If you're interested in synchronicity and frequency illusion, Sergei v. Chekanov wrote a book that sounds interesting https://jwork.org/designed-world/
> Have you ever experienced coincidences that cannot be logically explained? This book helps the readers understand the meaning of synchronicity, or remarkable coincidences in people's lives. This work not only explains the mystery of synchronicity, originally introduced by Carl Jung, but it also shows how to make simple calculations to estimate the chances that coincidences are not due to mere randomness.
This is a brilliantly accessible study, and ripe for some fun follow-ups. It would be interesting to replicate in other continents and with different species. I suspect the mystery will be readily resolved through hypothesis testing under controlled conditions.
Hopefully this doesn't seem like advertising - I'm not affiliated with the project in any way. I just particularly enjoy playing with cyberdecks [1] and stumbled upon this while browsing. I continue to have a lot of fun with the uConsole and SDR, but I've long wanted a Cardputer with a bit more oomph. I should add, if anyone is interested in a uConsole, brace yourself for the shipping times... [2].
Interesting! There's a lot I don't know about this, but I know a little more now. I'll admit, I naively thought this would be more regular than it appears to be [0].
The Earth is generally expected to spin more slowly over time, due to tidal friction. But it has been spinning faster and faster since the 1960s, as shown in the figure in the Wikipedia article [0].
I have read numerous explanations, but haven't found a really authoritative discussion.
Wow, that's really interesting. A great quote from the article:
> In 2021, it was reported that Earth was spinning faster in 2020 and experienced the 28 shortest days since 1960, each of which lasted less than 86399.999 seconds.[24] This caused engineers worldwide to discuss a negative leap second and other possible timekeeping measures, some of which could eliminate leap seconds.[25] The shortest day ever recorded was 29 June 2022, at 1.59 milliseconds less than 24 hours.[26] In a 2024 paper published in Nature, Duncan Agnew of the Scripps Institution of Oceanography projects that the water from increasing ice cap melting will migrate to the equator and thus cause the rate of rotation to slow down again.[26]
They teach us Scientific Realism in school, but the reality is that we are actually using Instrumentalism.
That said, no one wants to admit it, so contemporary science follows Falsification, where we find ways to not actually make claims about reality. (As an Instrumentalist/pragmatist, I love Karl Popper; it's just not metaphysical truth, and that would break Popper's heart.)
I'd argue the opposite is true for anyone who has studied statistics which is largely built on Instrumentalism (think George Box: 'All models are wrong, but some are useful') and Popperian falsification (Null Hypothesis testing). We are absolutely taught to treat models as predictive tools rather than metaphysical truths.
Statistics was even presented as metaphysical truth. Or at least that was my experience in engineering school.
And in fluid dynamics, we used the Reynolds number, which is a made-up ratio that helps with decision making... It's not like, when we answered questions, we could address the grey area we are discussing.
If I had to guess, I think it's due to Western civilization being built on Platonism (and even Aristotle was infected). Our science and morality were later built on Platonic realism. Only in the last 100-ish years are we starting to get over it.
A distinction without a difference. The only way we can interact with the world is via senses, via instruments, via measurement. We can rehash solipsism, but seeing as how that is an immediate dead end we all agree there is a physical reality. If there is in fact a reality, then we are measuring something real.
I think it matters. No, the planets are not doing circles around the Sun. Circles don't actually exist; they are doing ellipses.
Also 'real' has quite a few meanings. If I ask the question 'Are you closer to a keyboard or the gym?' does that question exist?
This kind of stuff does end up mattering. It becomes much more noticeable in psychology (and biology). If you read Freud, Adler, or Jung, you will say "Oh, extrovert! I've seen that before!" But then you realize it's vague and almost always true. It's like a horoscope.
So if we think there is a truth to reality, we look for perfect relations. If we think its impossible for humans to figure out, we look for best fits.
I'm very interested to see how some VPN providers react to this. For a zero logs VPN provider, if such a thing can really exist, how big of a problem is this? Presumably many customers pay with a debit/credit card already so there's some PII on file? Usage remains the same? Surely savvy people can just use their existing VPN to buy a VPN from outside the UK.
Of course, we're sliding quite rapidly down that slippery slope here so I'm sure logging and easier government tracking would be next. The justifications will get weaker and even more lacking in supporting evidence for their implementation.
> Presumably many customers pay with a debit/credit card already so there's some PII on file?
Yes. But I think most of the zero-logs providers will remove the identifiable payment details after a certain amount of time, e.g. Mullvad have a specific policy covering what is stored and for how long (I am not affiliated with Mullvad, I just use their service).
I believe a whole host of VPN providers have no real need to comply with this amendment if it passes the Commons.
The providers are structured in a way that makes forcing compliance difficult and have built their whole business model around this. NordVPN is registered in Panama for example and Mullvad lets you send cash in the mail and doesn't store any user details (even a hashed email).
It'll be interesting to see how & who reacts if it does pass.
> Now more than ever, trusting a US jurisdiction VPN provider? No thanks!
The whole point of Obscura is you aren't trusting any single company. A Swedish company and an American company would need to collude to cause a problem. Unless you know something I don't?
> The whole point of Obscura is you aren't trusting any single company.
First, Mullvad's infrastructure has been independently audited.
Mullvad's integrity has also been tested, as proven by a legal case in which they were subject to a search warrant when someone was trying to claim copyright infringement.
As far as I can tell, Obscura has not had anywhere near the same scrutiny.
Second, Obscura is the first hop, is it not?
Therefore it may well "only" relay the traffic to the exit node but it is still a relay and hence open to SIGINT analysis by the US.
I would have thought, therefore, that using Mullvad's built-in multi-hop mode on their audited platform would be the wiser decision?
Hence why Mullvad is being used as the exit point.
You have full e2ee between yourself and Mullvad, but crucially Mullvad doesn't know your IP. Five Eyes are already doing SIGINT on behalf of both the US and the UK governments before my connection even reaches Obscura, so I lose nothing but potentially gain privacy.
How is it you think a single company (Mullvad) having access to my IP and what I am browsing is less secure than splitting it up amongst multiple providers one of which being Mullvad with that audited platform you talk about?
If I wanted Tor on top I'd layer it on top too but that would still be a single point of failure.
It's open source, which means I can trust having the app installed if I build from source (or I can just use WireGuard directly). I then know I'm directly connected to a Mullvad WireGuard node by checking the public key here: https://mullvad.net/en/servers
Short of the WireGuard protocol itself being broken, there is no way for Obscura to snoop, presuming I check the public key. I'm not saying I trust Obscura; I'm saying that with their model I don't need to trust them, which is vastly superior. Nor do I need to trust Mullvad.
You keep hand waving around that Obscura are somehow untrustworthy but you have steadfastly refused to address the fact that their model does not require trust. If you trust Mullvad (which you are claiming to) please show an attack that would work to breach this model. You can't.
Sadly, if you look at how the law is drafted, it's set up to catch companies that have a significant UK base, not just those that advertise here. It is highly likely, for compliance reasons (as we saw with imgur and others), that they will simply block the UK themselves.
This seems to have been posted a few times over the years, e.g. [0]. I was impressed and pleased to see that I hadn't missed the boat on this, and in fact, the project seems to still be going very strong [1]!
Perhaps a bit cynical, but it seems that as Microsoft continue to shove ads in absolutely everywhere and track everything they possibly can, Apple are content to be just marginally better rather than actually having meaningfully higher standards. Of course, it's business as usual, but we are boiling the frog for the next generation by tolerating it.
The problem is that these systems are so costly and hard to build that, without a capital incentive, no independent entity is going to make them. And what entity has an interest in making them as a "loss leader" if they aren't monetized in any way (ads or a paid product)?
Apple has been running maps for well over a decade without this. They are one of the most profitable businesses in history and have spent almost a trillion dollars in financial games to enrich stakeholders because they had so much cash to burn.
The idea that "poor little Apple is struggling without enshittifying to microptimize profit opportunities" is an utter joke.
I think they were talking about the challenge for a non Apple/Google competitor to emerge in this space with a comparable enough product to win real, meaningful marketshare.
Firefox's abysmal market share, despite being, for the average user, a strictly better experience, would incline me to agree.
Yeah Apple’s evolution over the past decade has been very frustrating and disappointing to see. It seems like whatever scraps remain of the company’s core values now exist solely with a handful of old heads at the company and will likely not survive their retirements.
A lot has changed in the tech industry, but the rapidity of hiring and expansion of headcount just seems to have engendered a broad homogenization of business strategies, design conventions, and product vision. I think they started hiring people based on narrowly defined ideas about skills and resumes to fit certain roles, and they all end up shuffling the same bunch of people around across the same incestuous company hiring pipelines until they're all doing stints at every company and driving them in the same broad direction.
[0]: https://news.ycombinator.com/item?id=47964003