wget (or curl) the entire contents of a forum thread?

2

1

The URL of the forum thread I'm trying to download has the form

http://domain.com/showthread.php?threadid=3333333&userid=0&perpage=40&pagenumber=1

I've tried

wget --user-agent=Mozilla/5.0 -k -m -E -p -np -R http://domain/showthread.php?noseen=0&threadid=3333333&pagenumber=1

and I've had no luck.

wgethelp

Posted 2011-02-04T00:37:38.327

Reputation: 21

Answers

1

Why not just use a for loop:

for pageno in {1..1000000}; do
    # Quote the URL so the shell does not treat '&' as "run in background"
    wget ... "http://domain/showthread.php?noseen=0&threadid=3333333&pagenumber=$pageno" || break
done

or perhaps a while loop is better, if a little longer to write:

i=1
while true; do
    # Use $i, the counter this loop actually increments, and quote the URL
    wget ... "http://domain/showthread.php?noseen=0&threadid=3333333&pagenumber=$i"
    if test $? -ne 0; then
        break
    fi
    i=$((i+1))
done
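
One thing to watch with either loop: the URL contains & and ?, which the shell interprets itself unless the URL is quoted. Also, many forums return a normal page even past the last page number, so wget may never fail; whether it does depends on the forum. Here is a minimal sketch that sidesteps both, assuming you know the page range in advance. The names page_url, fetch_pages and FETCH are made up for illustration, the wget options are the ones from the question (minus the mirroring flags), and the URL is the placeholder one from above.

```shell
# Hypothetical helper: build the URL for one page of the thread.
# Host and query parameters are the placeholders from the question.
page_url() {
    printf 'http://domain/showthread.php?noseen=0&threadid=3333333&pagenumber=%s' "$1"
}

# Hypothetical helper: fetch pages $1 through $2, stopping at the first failure.
# FETCH is an assumed override so the loop can be dry-run (e.g. FETCH=echo).
fetch_pages() {
    page=$1
    last=$2
    while [ "$page" -le "$last" ]; do
        # The quoted command substitution keeps '&' and '?' away from the shell.
        ${FETCH:-wget --user-agent=Mozilla/5.0 -E -k -p} "$(page_url "$page")" || break
        page=$((page + 1))
    done
}
```

Running FETCH=echo fetch_pages 1 3 only prints the URLs that would be fetched, which is a cheap way to check the quoting before hitting the server; drop the FETCH=echo part to download for real.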

Mikel

Posted 2011-02-04T00:37:38.327

Reputation: 7 890

OP here, thanks Mikel.

Actually, I'm having problems downloading even a single page of the thread. Once I have that working I was planning to use the first approach you suggested, but there seems to be something wrong with my wget parameters (maybe? I'm not sure). – None – 2011-02-04T02:01:37.530

0

It might be worth checking whether the forum offers RSS feeds for sections or threads. That would save you the bother.

Sirex

Posted 2011-02-04T00:37:38.327

Reputation: 10 321