Send HTTP request to website with password and username, then record results

It depends on how complex the information on the web page is. If it is something that can be grepped out, you could use the SO answer linked in the comment on the question. If it cannot easily be grepped out, you could write a Python script to do this for you. You would use urllib2 and cookielib (for its CookieJar) to log in and keep the session cookies, and then something like lxml or BeautifulSoup to parse the HTML. The SO answer here is an excellent guide on how you could log in. For convenience, I'm copying the code here:

import cookielib
import urllib
import urllib2
from BeautifulSoup import BeautifulSoup #you can also use lxml, if you wanted.

# Store the cookies and create an opener that will hold them
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

# Add our headers
opener.addheaders = [('User-agent', 'RedditTesting')]

# Install our opener (note that this changes the global opener to the one
# we just made, but you can also just call opener.open() if you want)
urllib2.install_opener(opener)

# The action/target from the form
authentication_url = 'https://ssl.reddit.com/post/login'

# Input parameters we are going to send
payload = {
  'op': 'login-main',
  'user': '<username>',
  'passwd': '<password>'
  }

# Use urllib to encode the payload
data = urllib.urlencode(payload)

# Build our Request object (supplying 'data' makes it a POST)
req = urllib2.Request(authentication_url, data)

# Make the request and read the response
resp = urllib2.urlopen(req)
contents = resp.read()

# Parse the page using BeautifulSoup. You'll have to look at the DOM
# structure to do this correctly, but there are resources all over the
# place that make this really easy.
soup = BeautifulSoup(contents)
myTag = soup.find("<sometag>")
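The code above targets Python 2 (urllib2, cookielib). On Python 3 the same standard-library pieces live in urllib.request, urllib.parse and http.cookiejar. Here is a minimal sketch of the login step under that assumption. The helper names make_opener and build_login_request are hypothetical, and the form fields are simply the ones used above; check the real form on the site you are logging in to.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener(user_agent="RedditTesting"):
    """Return (opener, jar); the opener keeps cookies between requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    opener.addheaders = [("User-agent", user_agent)]
    return opener, jar


def build_login_request(url, user, password):
    """Build the login request; supplying data makes it a POST."""
    # Field names are assumptions copied from the example above.
    payload = {"op": "login-main", "user": user, "passwd": password}
    # On Python 3 the request body must be bytes, not str.
    data = urllib.parse.urlencode(payload).encode("utf-8")
    return urllib.request.Request(url, data=data)
```

Nothing is sent until you call opener.open(req).read() with the request returned by build_login_request. Later requests made through the same opener reuse whatever cookies the login response set.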

You can then schedule this to run every X minutes, or have Python itself repeat the code above every X minutes, and post or email the results. Depending on what you're trying to do, this might be overkill, but it's the route I've taken when I've needed to do something similar in the past.
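If you go the "let Python time it" route, a plain loop with a sleep is often enough. This is only a sketch: run_every is a hypothetical helper, task stands for whatever function wraps your login-and-parse code, and the sleep parameter exists only so the loop can be exercised without actually waiting.

```python
import time


def run_every(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call task() repeatedly, pausing interval_seconds between calls.

    With max_runs=None the loop runs until the process is interrupted.
    Returns the number of times task() was called.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        # Skip the pause after the final run of a bounded loop.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

For example, run_every(check_site, 300) would call a (hypothetical) check_site function every five minutes. Note that the interval is measured from the end of one run to the start of the next, so a slow task stretches the cycle.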

Karthik Rangarajan

Posted 2013-12-25T22:31:34.920

Reputation: 181

Would a div tag in a HTML structure be easily grepped out? – hichris123 – 2013-12-26T02:22:00.753

Yes, it shouldn't be hard. It makes it easier if the div has an ID or similar unique characteristic. At that point, you would do something like soup.find("div", {"id": "uniqueid"}), and it would find the exact div you want. – Karthik Rangarajan – 2013-12-26T02:24:39.330
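To illustrate the same idea without installing anything, here is a rough standard-library alternative to the soup.find call above, written for Python 3. It swaps BeautifulSoup for html.parser. The class DivTextById and the helper div_text are hypothetical names, and the sketch assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser


class DivTextById(HTMLParser):
    """Collect the text inside the first <div> with the given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target div
        self.done = False   # True once the target div has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth:
            self.depth += 1  # a div nested inside the target
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def div_text(html, target_id):
    """Return the stripped text of the div, or '' if it is not found."""
    parser = DivTextById(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()
```

Calling div_text(contents, "uniqueid") on the decoded page would then return the text of that div. A real parser such as BeautifulSoup or lxml copes far better with messy markup.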

Does it use cookies or do you need to login every time? – Thomas Weller – 2013-12-25T23:03:21.167

@ThomasW. If I click a Remember Me button when logging in, yes, it does since it automatically has me logged in. – hichris123 – 2013-12-25T23:08:35.183

There's a good answer for this question here: http://stackoverflow.com/questions/1324421/how-to-get-past-the-login-page-with-wget – sahmeepee – 2013-12-25T23:38:15.620
