Specifically, I would like to be able to download certain pages from my user profile on the various Stack Exchange sites. I would, however, like to do this automatically (using a cron
job), from the command line and in a parsable format. I much prefer using Linux for this, but I could get access to a Mac or Windows machine if necessary.
Ideally, I would like to use a tool like Wget or cURL to fetch the pages. I don't know how to get past the login, though. I have seen suggestions that you can log in via Firefox, export the relevant cookie, and import it into Wget through its --load-cookies
option (for example here and here). While this works if I have just logged in, it stops working after a while, I guess because the ID token has to be refreshed.
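One way to test that guess would be to look at the expiry timestamps in the exported file. The following is only a sketch, assuming the export is in the usual Netscape cookies.txt layout (tab-separated: domain, subdomain flag, path, secure, expiry, name, value) and that the expiry is a Unix timestamp. The function name expired_cookies is made up for illustration.

```shell
#!/bin/sh
# List cookies in a Netscape-format cookies.txt whose expiry time has passed.
# Assumed field order: domain, subdomain flag, path, secure, expiry, name, value.
# An expiry of 0 marks a session cookie and is skipped here.
expired_cookies() {
    file=$1
    now=${2:-$(date +%s)}   # optional second argument: a fixed "now" in epoch seconds
    awk -F '\t' -v now="$now" '
        /^#HttpOnly_/ { sub(/^#HttpOnly_/, "") }   # some exporters prefix HttpOnly cookies
        /^#/ || NF < 7 { next }                     # skip comments and blank lines
        $5 != 0 && $5 < now { print $6 " (" $1 ")" }
    ' "$file"
}

# Example call (commented out; the path is whatever file you exported):
# expired_cookies ~/cookies.txt
```

If this prints nothing while the download is already failing, the cookie was probably invalidated on the server side rather than expired locally, and re-exporting it is the only fix I can see with this approach.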
So, just after logging in to SU and exporting my cookies, I can do:
wget --load-cookies cookies.txt \
'https://superuser.com/users/151431/terdon?tab=responses'
After a few minutes, though, I get a 404 error:
wget -O ~/stack/$(date +%s) --load-cookies ~/cookies.txt \
'https://superuser.com/users/151431/terdon?tab=responses'
--2013-08-06 04:04:14-- https://superuser.com/users/151431/terdon?tab=responses
Resolving superuser.com (superuser.com)... 198.252.206.16
Connecting to superuser.com (superuser.com)|198.252.206.16|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2013-08-06 04:04:15 ERROR 404: Not Found.
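Since this is meant to run from cron, the failure should at least be detectable without reading the log by hand. A minimal sketch, assuming the exit codes listed in the EXIT STATUS section of the Wget manual (8 means the server issued an error response). The function name wget_status_message and the fetch.log file are hypothetical.

```shell
#!/bin/sh
# Translate a wget exit status into a short message for a cron log.
# Codes are taken from the "EXIT STATUS" section of the wget manual.
wget_status_message() {
    case $1 in
        0) echo "ok" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "authentication failure" ;;
        8) echo "server returned an error response (for example 404)" ;;
        *) echo "wget failed with status $1" ;;
    esac
}

# Sketch of use inside the cron script (commented out because it needs network
# access; fetch.log is a hypothetical log file name):
# wget -q -O "$HOME/stack/$(date +%s)" --load-cookies "$HOME/cookies.txt" \
#     'https://superuser.com/users/151431/terdon?tab=responses'
# wget_status_message "$?" >> "$HOME/stack/fetch.log"
```

This does not solve the login problem. It only makes it obvious when the exported cookie has stopped working.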
So, how can I automatically log in to an OpenID-enabled website from the command line?
PS. I think this is better suited here than on Web Applications, since my question is really about the command-line aspect and not the actual details of the web page in question. I would guess that any solution will be applicable to all OpenID sites.
Have you looked into the SE API (https://api.stackexchange.com) to see if it provides the information you're looking for? This is the official way to get programmatic access to the data, and it uses OAuth to authenticate. – heavyd – 2013-08-06T04:26:23.023
@heavyd yeah, I was kinda hoping I would not have to delve into the API for this. If that's the only way, I guess I'll have to. From a cursory glance, it does not appear as though I can automate the login process through the API, though. Do you know if I can authenticate in a way that requires no active input from me? If I understand the docs correctly, to get data that requires authentication I will need to manually log in. – terdon – 2013-08-06T13:19:47.833
I haven't actually used the SE API, but in other OAuth implementations I've used, you log in once and you're given a token which is good indefinitely. – heavyd – 2013-08-06T14:26:03.760