Anon
12/15/2023 (Fri) 20:53
No.9053
I've been running this type of workflow on and off for the past few months. I was fixing corruption (I save blocks to two HDDs, and the non-ZFS one got a bit messed up):
>$ # If you have part of a CID in one repo/HDD and want all of it and have a complete copy in another repo, then run this:
>$ cid=QmQh8RwLvQv91b8rLrmbL4zJE4v5Rg9gPU64R9EUdopunW
>$ has_all=/z2/b/ipfs/.ipfs
>$ has_part=/mnt/n/b/ipfs/.ipfs
>$ ipfs pin add --progress $cid 2> >(tee b.txt >&2); h=$(cat b.txt | sed "s/.* //g"); echo $h; IPFS_PATH=$has_all; ipfs dag export $h > $h.car; IPFS_PATH=$has_part; ipfs dag import --stats --pin-roots=false $h.car; rm $h.car
>$ # repeat previous command 14 times
>$ !!; !!; !!; !!; !!; !!; !!; !!; !!; !!; !!; !!; !!; !!; !!
Elegant method didn't work for some reason. Oh, it's because it uses sh and not bash:
>$ seq 1000 | xargs -d "\n" sh -c 'for args do ipfs pin add --progress QmQh8RwLvQv91b8rLrmbL4zJE4v5Rg9gPU64R9EUdopunW 2> >(tee b.txt >&2); h=$(cat b.txt | sed "s/.* //g"); echo $h; IPFS_PATH=/z2/b/ipfs/.ipfs; ipfs dag export $h > $h.car; IPFS_PATH=/mnt/n/b/ipfs/.ipfs; ipfs dag import --stats --pin-roots=false $h.car; rm $h.car; done' _
>_: 1: Syntax error: redirection unexpected
>>9051
>[crawls of wikis] were often very problematical and poorly constructed no matter what I did.
Just use grab-site. It writes WARCs which capture the actual native format of web/http content. grab-site works in GNU/Linux and maybe also Windows 10. I use it:
https://github.com/ArchiveTeam/grab-site
>The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
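Back to the xargs loop that died with "redirection unexpected": the `2> >(tee ...)` part is process substitution, which bash has and plain sh (dash) doesn't. Swapping `sh -c` for `bash -c` should get past that error. Another option is a plain bash loop that writes stderr to a file, so there is no process substitution at all. Rough sketch below, not run against a real repo. `repair_once` and `repair_loop` are names I made up, and two things are assumptions: that the last word of the last stderr line is the missing block's CID (same idea as the sed above), and that it's fine to stop as soon as the pin succeeds instead of always doing a fixed number of passes.

```bash
#!/usr/bin/env bash
# Sketch only. repair_once/repair_loop are made-up names;
# the CID and repo paths are the ones from the post.
cid=QmQh8RwLvQv91b8rLrmbL4zJE4v5Rg9gPU64R9EUdopunW
has_all=/z2/b/ipfs/.ipfs     # repo with the complete copy
has_part=/mnt/n/b/ipfs/.ipfs # repo with the partial copy

# One pass. Returns 0 if the pin went through (nothing left to fix),
# 1 if a block was copied over and another pass is needed, 2 on error.
repair_once() {
    local err h
    err=$(mktemp) || return 2
    # Plain file redirect instead of 2> >(tee ...): no process substitution.
    if IPFS_PATH=$has_part ipfs pin add --progress "$cid" 2> "$err"; then
        rm -f "$err"
        return 0
    fi
    cat "$err" >&2
    # Assumption: the last word of the last stderr line is the CID
    # of the block that could not be read.
    h=$(sed 's/.* //g' "$err" | tail -n 1)
    rm -f "$err"
    [ -n "$h" ] || return 2
    echo "$h"
    # Export from the complete repo, import into the partial one.
    if ! IPFS_PATH=$has_all ipfs dag export "$h" > "$h.car"; then
        rm -f "$h.car"
        return 2
    fi
    if ! IPFS_PATH=$has_part ipfs dag import --stats --pin-roots=false "$h.car"; then
        rm -f "$h.car"
        return 2
    fi
    rm -f "$h.car"
    return 1
}

# Repeat up to $1 passes (default 15), stopping once the pin succeeds.
repair_loop() {
    local max=${1:-15} i rc
    for ((i = 1; i <= max; i++)); do
        repair_once && rc=0 || rc=$?
        case $rc in
            0) echo "pinned after $((i - 1)) repair pass(es)"; return 0 ;;
            1) ;; # one block repaired, go again
            *) echo "giving up on pass $i" >&2; return "$rc" ;;
        esac
    done
    return 1
}
```

Source the file and run `repair_loop 1000` for the same upper bound as the `seq 1000` version. It writes the temporary `.car` file into the current directory, same as the one-liner.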