Anon 12/06/2023 (Wed) 06:15 No.8924 del
>>8922
Here's what worked for me. Given a set of posts saved as text files, it finds the posts that contain ten consecutive lines of greentext:
>$ curl -sL https://ipfs.filebase.io/ipfs/Qmb7pn6qDfb75QZx65W26JPiffZe5rGjJgpCC4R4hV6F4Y?format=car | ipfs dag import
>$ # make sure you have ipfs and curl installed
>$ ipfs get Qmb7pn6qDfb75QZx65W26JPiffZe5rGjJgpCC4R4hV6F4Y
>$ # downloads the CID's contents into a folder named after the CID in the current directory
>$ cd ./Qmb7pn6qDfb75QZx65W26JPiffZe5rGjJgpCC4R4hV6F4Y/4chan/mlp/thread/40219665/
>$ # change directory into the folder with the posts
>$ ls | head -n7
>40219665
>40221229
>40221685
>40221730
>40222638
>$ # one text file (without any file extension) per post
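>$ find . -type f | xargs -d "\n" sh -c 'for f do echo "$f"; perl -pE "s/\n/\0/g" "$f" | grep -Poa "(?:\0\s*>[^\0]*){10}\0" 1>/dev/null; echo $?; done' _ > 0match1.txt
>$ # perl turns every newline into a NUL byte so grep sees each post as a single line
>$ # [^\0]* stops at the end of each line, so the ten ">" lines have to be consecutive
>$ # writes each file name followed by grep's exit status (0 = match, 1 = no match) to "0match1.txt"
>$ # make sure that you have perl installed, plus GNU xargs and a GNU grep with -P support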
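If the NUL-byte regex is hard to follow, the same check can be sketched with awk. This is not what I ran above, just a minimal alternative: the function names `has_greentext_run` and `list_matches` are made up for illustration, and the second one assumes 0match1.txt holds alternating lines of file name and exit status.

```shell
# Hypothetical helper (not from the post above): succeeds when FILE contains
# at least N consecutive lines whose first non-blank character is ">".
has_greentext_run() {
    awk -v n="${2:-10}" '
        /^[[:space:]]*>/ {
            if (++run >= n) { found = 1; exit }
            next
        }
        { run = 0 }
        END { exit !found }
    ' "$1"
}

# Hypothetical helper: reads a file of alternating "name" / "exit status"
# lines (the layout of 0match1.txt) and prints the names whose status is 0.
list_matches() {
    awk 'NR % 2 { name = $0; next } $0 == "0" { print name }' "$1"
}
```

Something like `for f in *; do has_greentext_run "$f" 10 && echo "$f"; done` should then print the matching posts directly, and `list_matches 0match1.txt` should pull the matching names out of the file written earlier.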
