More questions! Anon 05/13/2024 (Mon) 08:34 No.10369 del
>>10367
More advanced:
This is very broad and not something I feel I can cover with any sort of justice right now. Archival faces a whole host of issues at the moment, with significant problems on several fronts. The free data and generous terms of the 2000s and 2010s internet are dying. On top of that there is a rise in censorship and a desire to rein in the old civil libertarian spirit of the Internet (for good or ill, one cannot deny that a lot of things will be wrongly caught in the crossfire), endangering too many things to count. The central archives that underpin a lot of web history are also at risk for a variety of reasons. One bad lawsuit or the wrong person calling it quits (in the case of someplace like archive.today or The Pony Archive) might mean the loss of YEARS of archive work. I don't believe it is within the average person's means to save everything everywhere, but if a lot of people each put in a little effort, a lot more could be saved than we expect if these places ever go down. Realistically, any long-term advanced archiving should take these factors into account.

Hardware: This really deserves its own section. Simply downloading and storing a fair bit is pretty easy now and can be done with potato PCs and a few external hard drives. Full-on data hoarding with a plan of keeping something available for years or even decades requires more careful planning. Plus plenty in between! From an old Optiplex, to cheap default-configured NASes, to full-on enterprise-grade servers, there are a lot of setups that could work for a lot of different people. One thing to remember though: multiple backups! Not everyone can go full 3-2-1 method (three copies, on two different kinds of media, with one off-site), but unless you're storing it short term it is good to have at least two copies of something. Also learn about bit rot!
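
On bit rot: files can get silently corrupted while sitting on a disk for years, and a second copy only helps if you can tell which copy is still good. A rough sketch of one way to check, assuming Python and SHA-256 checksums. The function names (sha256_of, build_manifest, find_mismatches) are made up for illustration and are not from any particular tool:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map every file under root (by relative path) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(manifest, root):
    """Return relative paths that are missing or whose hash no longer matches."""
    root = Path(root)
    bad = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel_path)
    return bad
```

The idea would be to build the manifest once when the archive is first stored, keep it alongside each copy, and rerun find_mismatches against every backup now and then. A file that fails on one drive can then be restored from the copy that still matches.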


Glossary:
For some of the terms you may see around here frequently (needs expansion but this is a start).

IPFS
InterPlanetary File System: something that has been getting a lot of use around here lately. It is a distributed, decentralized network and protocol. A simple TL;DR would be: think BitTorrent without a central server (but more than that). I like what Archivist said here on it: >>10173
>General idea, from one perspective. Are you interested in BitTorrent, but wish it was as elastic and expansive as the web? IPFS may be your solution! It takes good ideas from various things, such as HTTP and BitTorrent. Similar to the web, which has many various things, try not to rely on others to host IPFS data. In BitTorrent you can somewhat rely on other peers to host "important data" (read: some retarded TV show/movie/anime/video game/etc.) forever. Can't expect other peers to host your HTML or folder forever.
I could also invoke comparisons to ZeroNet and Freenet but I think those might be more obscure!

WARC
Web ARChive: a file format designed specifically for archiving websites. This isn't the same as downloading a single web page and is, to put it simply, much better at getting a site intact than most manual scraping.
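
To give a feel for what is inside one: a WARC file is a series of records, each with a small block of named header fields followed by the captured bytes. Below is a minimal sketch of building a single "resource" record by hand, assuming WARC 1.0 and Python. build_warc_record is a made-up helper for illustration, and real crawlers and archiving tools write much richer records (full HTTP requests and responses, digests, compression), so treat this as a look at the layout rather than a way to archive a site:

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri, payload, content_type="text/html"):
    """Build one WARC/1.0 'resource' record as bytes (illustrative helper)."""
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.0",                                    # version line comes first
        "WARC-Type: resource",                         # payload stored as-is
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",  # unique ID per record
        f"WARC-Date: {timestamp}",                     # capture time in UTC
        f"WARC-Target-URI: {target_uri}",              # where the payload came from
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",             # payload size in bytes
    ]
    # Header lines end with CRLF, and a blank line separates headers from payload.
    header_block = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    # Each record is terminated by two CRLFs after the payload.
    return header_block + payload + b"\r\n\r\n"
```

Since the original URL and capture time travel with the content, a replay tool can later serve the page as it was, which is a big part of why this beats a folder of loose saved HTML files.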
