This would be half OK if it worked, but you can't trust it to. OneDrive, for instance, has had an open bug for years now where it will randomly revert some of your files to a revision from several months earlier. You can detect and recover from this using the history, but only if you know that it happened and to which files, which you usually won't because it happens silently. I only noticed because it happened to an append-only text file I use daily.
A specific implementation (OneDrive) doing something dumb doesn't invalidate the entire paradigm though. Things work just fine elsewhere (Dropbox, Google Drive, Nextcloud, and Seafile are all solutions I've had good experiences with).
Agreed, I’ve been using Dropbox for 15 years with minimal issues. The key is to ensure it’s running and syncing with the proper settings on both machines.
What can get things into a weird state is if both machines are editing the same file while only one of them is actively syncing. But for basic backup and sync, this is extremely rare.
Unlimited strings are a problem. People will use it as storage.
No, I'm not joking. We used to allow arbitrary paths in a cloud API I owned. Within about a month someone had figured out that the cost to store a single-byte file was effectively zero, and that they could encode arbitrary files into the paths themselves. It wasn't long before there was a library on GitHub to do it. We had to put limits on it because otherwise people would store their data in the path, not the file.
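The abuse pattern is simple enough to sketch. Assuming a hypothetical storage API where creating a zero-byte object at an arbitrary path is free, all the real data can ride in the object names. The `freebytes` prefix, the segment size, and both helper functions below are illustrative inventions, not any real library; base32 is used because it survives case-insensitive and character-restricted path rules:

```python
import base64

SEGMENT = 200  # hypothetical per-component length limit we stay under


def encode_to_paths(data: bytes, prefix: str = "freebytes") -> list[str]:
    """Pack arbitrary bytes into the names of zero-byte objects.

    Each path component carries one chunk of the base32 payload; a
    fixed-width index prefix lets a reader reassemble chunks in order.
    """
    payload = base64.b32encode(data).decode("ascii")
    chunks = [payload[i:i + SEGMENT] for i in range(0, len(payload), SEGMENT)]
    # One zero-byte "file" per chunk; all the real data lives in the name.
    return [f"{prefix}/{i:08d}-{chunk}" for i, chunk in enumerate(chunks)]


def decode_from_paths(paths: list[str]) -> bytes:
    """Reassemble the payload by sorting on the index prefix."""
    ordered = sorted(paths, key=lambda p: p.rsplit("/", 1)[1])
    payload = "".join(p.rsplit("/", 1)[1].split("-", 1)[1] for p in ordered)
    return base64.b32decode(payload)
```

From the provider's side every one of those objects costs nothing under a bytes-stored billing model, which is exactly why the limit had to land on path length instead.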
I remember someone telling me that S3 used to be similarly abused - people were creating empty files and using S3 like a key-value store somehow, so AWS just jacked up the price of S3 head-object API call to push people back to DynamoDB or whatever.
Not sufficient, unfortunately. The strings for file paths are stored in wholly different infrastructure with wholly different optimizations. It probably lives in your database. You really don't want people just stuffing gigabytes into that, payment or no payment. Odds are you didn't plan your control plane around, "what if someone uses our strings as encoded data?"
What if the limit lived in the fine print, only enforced against bad actors (with a guarantee that filenames under X characters would never be charged)? Or is that too problematic — building good faith into policy while "hiding" the info?
The reason: to avoid overcomplicating things or giving the appearance of nickel-and-diming.
No, just charge for the amount of storage they use on your server. Not the amount of data you think you’re storing. In non-special cases these will be the same number.
Would there be any engineering/management pushback on the customer side? "we have to write a tiny script", "this is non-standard" / "why are you the only ones who charge us for filenames?"
You expect the files to still be accessible using relative paths. What do you expect to happen if your cloud storage file path is 50 characters long and is mounted in a folder which is 4050 characters long when PATH_MAX is 4096?
The sync application itself can handle this using openat(2) or similar and should probably be using that regardless to avoid races.
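openat(2) is exposed in Python through the `dir_fd` parameter of `os.open`, so the walk-one-component-at-a-time approach can be sketched without any C. No single syscall ever sees a string longer than one component, so PATH_MAX never comes into play (and holding directory fds is also what closes the rename races mentioned above). The function name is mine, not any real sync client's API:

```python
import os


def open_deep(path: str, flags: int = os.O_RDONLY) -> int:
    """Open a file by resolving one path component per openat(2) call.

    Each os.open here passes a single component plus a dir_fd, so the
    kernel only ever resolves NAME_MAX-sized strings -- the total path
    length can far exceed PATH_MAX. Returns an open file descriptor.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("need at least one path component")
    dirfd = os.open("/" if path.startswith("/") else ".",
                    os.O_RDONLY | os.O_DIRECTORY)
    try:
        for part in parts[:-1]:
            next_fd = os.open(part, os.O_RDONLY | os.O_DIRECTORY, dir_fd=dirfd)
            os.close(dirfd)
            dirfd = next_fd
        return os.open(parts[-1], flags, dir_fd=dirfd)
    finally:
        os.close(dirfd)  # close the last directory fd, not the returned file fd
```

The same `dir_fd` trick works for `os.mkdir`, `os.rename`, etc., which is how a sync client can create trees deeper than PATH_MAX in the first place.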
Ah, I forgot that the maximum total path length is usually limited by PATH_MAX; it's the individual path component (NAME_MAX, typically 255 bytes) that's limited by the filesystem.
Point taken, although I still think it's better for cloud storage services to err on the side of compatibility, i.e. what's the lowest common denominator between Linux, macOS, Android, iOS from 10 years ago and Windows 7?
Oh yeah... I remember Windows behaving weirdly when I tried to copy some files with long names into a deeper directory tree. And it was just weird behaviour - no useful error message.
Windows in particular supports paths tens of thousands of characters long at the API level, much longer than Linux. The problem is that applications need to explicitly opt in using the long path syntax; otherwise they're limited to about 260 characters (MAX_PATH).
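The opt-in is the extended-length `\\?\` prefix, which tells the Win32 layer to pass the path through to NT (up to roughly 32,767 characters) without normalization. Building the prefixed form is pure string work, sketched below; the helper name is mine, and it assumes the input is already absolute and normalized, since extended-length paths skip normalization entirely:

```python
def to_extended_path(path: str) -> str:
    r"""Convert an absolute Windows path to extended-length (\\?\) form,
    lifting the ~260-character MAX_PATH limit for APIs that accept it."""
    if path.startswith("\\\\?\\"):
        return path                        # already extended-length
    if path.startswith("\\\\"):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return "\\\\?\\UNC\\" + path[2:]
    return "\\\\?\\" + path                # drive path: C:\... -> \\?\C:\...
```

This is also why "supports it at the API level" doesn't save most software: every program (and every library it calls) has to construct paths this way, which is why newer Windows 10 builds added an opt-in manifest/registry switch instead.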
Except the GNU tools, which have "no arbitrary limits" as a design principle. And that means no limits at all, not "no sane limits":
Avoid arbitrary limits on the length or number of any data structure, including filenames, lines, files, and symbols, by allocating all data structures dynamically.
I assume they're relying on the OOM Killer and quotas to prevent DoSes all over the place.