this post was submitted on 16 Jun 2023
128 points (100.0% liked)
Technology
Ultimately this is a problem that's never going away until we replace URLs. The HTTP approach of finding documents by URL, i.e. server/path, is fundamentally brittle. It doesn't matter how careful you are or how much best practice you follow, that URL is going to be dead in a few years. DNS makes the problem worse: domain names cost money and expire, and every URL under them dies with them.
There are approaches like IPFS, which uses content-based addressing (i.e. fancy file hashes), but that's not enough either, as it provides no good way to update a resource.
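To make the update problem concrete, here is a minimal sketch of content-based addressing. It uses a plain SHA-256 hex digest as a stand-in for the address; that is an assumption for illustration, not IPFS's actual identifier format, and `content_address` is a hypothetical helper name.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in content address."""
    return hashlib.sha256(data).hexdigest()


v1 = b"hello world"
v2 = b"hello world, edited"

addr_v1 = content_address(v1)
addr_v2 = content_address(v2)

# The same bytes always map to the same address, no matter who hosts them.
assert content_address(b"hello world") == addr_v1

# Any edit produces an unrelated address, so a link to addr_v1 has no way
# of discovering that addr_v2 is the newer version of the same document.
assert addr_v1 != addr_v2
```

The address only names one exact sequence of bytes. Something outside the hash has to say "these two addresses are versions of the same thing".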
The best™ solution would be some kind of global blockchain thing that keeps a record of what people publish, giving each document a unique id, a hash, and some way to update that resource in a non-destructive way (i.e. the version history is preserved). Hosting itself would still need to be done by other parties, but a global log file that lists all the stuff humans have published would make mirroring it much easier and more reliable.
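Roughly what such a log could look like, as a toy sketch only: an in-memory, append-only list where each entry carries a stable document id, the hash of the content, and the hash of the previous entry. All names here (`PublicationLog`, `publish`, `history`, `latest`) are made up for illustration, and there is no consensus, signing, or networking, which a real global log would need.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class PublicationLog:
    """Toy append-only log: stores metadata only, never the content itself."""
    entries: list = field(default_factory=list)

    def publish(self, doc_id: str, content: bytes) -> dict:
        # Link each entry to the previous one so rewriting history is detectable.
        prev = self.entries[-1]["entry_hash"] if self.entries else None
        record = {
            "doc_id": doc_id,
            "content_hash": sha256_hex(content),
            "prev_entry": prev,
        }
        record["entry_hash"] = sha256_hex(
            json.dumps(record, sort_keys=True).encode()
        )
        self.entries.append(record)
        return record

    def history(self, doc_id: str) -> list:
        """All content hashes ever published under doc_id, oldest first."""
        return [e["content_hash"] for e in self.entries if e["doc_id"] == doc_id]

    def latest(self, doc_id: str) -> Optional[str]:
        versions = self.history(doc_id)
        return versions[-1] if versions else None


log = PublicationLog()
log.publish("my-essay", b"first draft")
log.publish("my-essay", b"second draft")

# Updating is just appending; the old version stays in the history.
assert log.history("my-essay") == [
    sha256_hex(b"first draft"),
    sha256_hex(b"second draft"),
]
assert log.latest("my-essay") == sha256_hex(b"second draft")
```

The document id stays the same across versions while the content hash changes, which is the piece that pure content addressing is missing.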
The end result should be "Internet as globally distributed immutable data structure".
Bit frustrating that this whole problem isn't getting the attention it deserves.
No offense, but that solution sounds like a pipedream that wouldn't work on a technical level. So you wish to keep not just the item someone published, but previous versions of it, have mirrors of it, and tie it all up in some sort of blockchain thing. That sounds insanely more resource heavy than just hosting the document itself on one instance somewhere. It would be much more reliable, sure, but currently even companies like Reddit can struggle with all of the traffic, as do smaller open source projects like Lemmy instances or kbin, and your solution is to increase the amount of data?
It really isn't. Most content out there is already immutable. You don't see people uploading the same YouTube video five times with minor changes or editing their images after the upload. Most services don't even allow that for users; at best you can delete the video and upload a new one.
Furthermore, the blockchain would only contain metadata, not the actual data, so it's automatically thousands of times easier to store than the data itself.
Mirroring that content is a completely separate and optional part of the problem. The important part is having content named in such a way that I can go to a mirror, ask "do you have XYZ", and get an answer that I can trust. With URLs that's impossible, as they can show different content whenever they want.
Also, this isn't exactly a new idea; that's how most software development already works these days. A Git repository stores a copy of every little change, and every clone retrieves that complete history. What's missing is some infrastructure on top of that which links all the different repositories together into one namespace (GitHub kind of does that internally, but that's of no help for repositories hosted elsewhere).
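A small sketch of what "an answer I can trust" means in practice, assuming the name of the content is its SHA-256 hash. The mirrors are plain dictionaries standing in for untrusted servers, and `fetch` is a hypothetical helper, not any real client API.

```python
import hashlib
from typing import Optional


def fetch(mirror: dict, content_hash: str) -> Optional[bytes]:
    """Ask an untrusted mirror for content and accept it only if it hashes correctly."""
    data = mirror.get(content_hash)
    if data is None:
        return None  # the mirror does not have it
    if hashlib.sha256(data).hexdigest() != content_hash:
        return None  # the mirror returned something else; reject it
    return data


original = b"some published video bytes"
name = hashlib.sha256(original).hexdigest()

honest_mirror = {name: original}
tampering_mirror = {name: b"something else entirely"}

# An honest mirror's answer verifies; a tampered answer is rejected locally,
# without having to trust the mirror or contact the original host.
assert fetch(honest_mirror, name) == original
assert fetch(tampering_mirror, name) is None
```

With a URL there is nothing to check the response against, so the same request can legitimately return different bytes tomorrow.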
Ok, so what if this blockchain has a metadata link to a video that is hosted somewhere, and I remove that video from that host? How is that different from a plain URL pointing to that video, if the blockchain just holds metadata?
I don't understand what you are solving.