Anon 12/08/2023 (Fri) 21:30 No.8980
Bash implementation for: 4chan rendered thread webpage ctrl+a,ctrl+c,ctrl+v -> TXT file -> extractor -> text files per-post
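
A rough sketch of what the extractor step could look like, not the actual script. It assumes every post in the pasted TXT starts with a header line containing a date and post number like "12/08/23(Fri)21:30:12 No.20278564"; that header format, the function name split_text_posts, and the output naming (<post number>.txt) are all assumptions, so adjust the pattern to whatever your paste really looks like.

```shell
#!/usr/bin/env bash
# Sketch only: split a copy-pasted thread text file into one text file per post.
# Assumption: each post begins with a header line that contains
# "MM/DD/YY(Day)HH:MM:SS No.<digits>". Lines before the first header are ignored.
split_text_posts() {
  local src=$1 outdir=${2:-.}
  mkdir -p "$outdir"
  awk -v outdir="$outdir" '
    # A header line starts a new post; the digits after "No." name the file.
    match($0, /[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]\([A-Za-z]+\)[0-9:]+ No\.[0-9]+/) {
      id = substr($0, RSTART, RLENGTH)
      sub(/.* No\./, "", id)
      if (out != "") close(out)
      out = outdir "/" id ".txt"
      count++
    }
    # Every line from a header onward belongs to the current post.
    out != "" { print > out }
    # Report how many posts were written.
    END { print count + 0 }
  ' "$src"
}
```

Usage would be something like: split_text_posts thread.txt posts/ (file and directory names are placeholders). It prints the number of posts it found, which is handy to compare against the post count shown on the thread.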

Done, mostly
Bash implementation for: Desuarchive thread webpage source code -> HTML file -> extractor -> HTML files per-post. Proof/example (remote pin=w3s):

Both methods can take a minute or more to process a thread.

[1] see
[2] via
>$ curl -sL > 20278564.htm
>$ cat 20278564.htm | perl -pE "s/<div class=\"post stub stub_doc_id_/\n<div class=\"post stub stub_doc_id_/g" | perl -pE "s/^<aside class=\"posts\">\n//g" | tail -n +191 | head -n199919991999 | head -n -203 | sed "s/ <\/aside>//g" | xxd -p | tr -d \\n | sed "s/../%&/g" | perl -pE "s/%0a/\n/g" | xargs -d "\n" sh -c 'for args do id=$(echo $args | sed "s/.*%22%20%69%64%3d%22//g" | sed "s/%22.*//g" | sed "s/%//g" | xxd -p -r -); echo $args | sed "s/%//g" | xxd -p -r - > $id.html; done' _; rm .html
>$ # this partly helped: # 337 posts incl. OP; ls | wc -l = 339; the difference of 2 = the how file and the complete thread file
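
Since the hex round-trip spawns several processes per post, a single awk pass might be a faster way to do the same split. This is only a sketch under assumptions, not a replacement tested against Desuarchive: it reuses the post marker from the command above, takes the first id="..." attribute after each marker as the file name (the pipeline above effectively takes the last one on the line), and does not trim page chrome after the final post. The function name split_posts is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: split a saved thread HTML file into one HTML file per post.
# Assumptions: each post starts with the marker <div class="post stub stub_doc_id_
# and contains an id="..." attribute made of letters, digits or underscores.
split_posts() {
  local src=$1 outdir=${2:-.}
  mkdir -p "$outdir"
  awk -v outdir="$outdir" '
    BEGIN { marker = "<div class=\"post stub stub_doc_id_" }
    {
      # Put a newline before every marker, then walk the resulting pieces.
      gsub(/<div class="post stub stub_doc_id_/, "\n&")
      cnt = split($0, parts, "\n")
      for (i = 1; i <= cnt; i++) {
        part = parts[i]
        if (part == "") continue          # empty pieces are dropped
        if (index(part, marker) == 1) {   # this piece starts a new post
          if (out != "") close(out)
          out = ""
          if (match(part, / id="[A-Za-z0-9_]+"/)) {
            out = outdir "/" substr(part, RSTART + 5, RLENGTH - 6) ".html"
            count++
          }
        }
        # Pieces before the first post, or in a post without a usable id, are skipped.
        if (out != "") print part > out
      }
    }
    # Report how many posts were written.
    END { print count + 0 }
  ' "$src"
}
```

Usage would be something like: split_posts 20278564.htm posts/ (the output directory name is a placeholder). The printed count can be checked against the thread's post count the same way as the ls | wc -l comparison above.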

