I am developing a Python program that uses Selenium (the WebDriver Python bindings) and PhantomJS (a headless WebKit browser scriptable through a JavaScript API) to load and interact with websites.
When I run this program on a local Ubuntu machine and network, it loads the websites correctly, and I can dump all of their HTML:
print webdriver.page_source
When I run it on the server, this line only prints
<html><head></head><body></body></html>
It looks as if the remote web server answered the request with an empty HTML page.
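To tell the two cases apart programmatically, the program could check for this empty skeleton before using the page. This is a minimal sketch, and `is_blank_page` and `EMPTY_DOCUMENT` are hypothetical names introduced here. It assumes that the skeleton above is what the driver reports when nothing was loaded, so it may indicate a failed load rather than a real empty response.

```python
import re

# The skeleton printed on the server (copied from the output above).
EMPTY_DOCUMENT = "<html><head></head><body></body></html>"


def is_blank_page(page_source):
    """Return True if page_source is empty or only the bare HTML skeleton."""
    # Drop all whitespace so indentation and newlines do not affect the comparison.
    normalized = re.sub(r"\s+", "", page_source or "")
    return normalized in ("", EMPTY_DOCUMENT)
```

It would be called as `is_blank_page(webdriver.page_source)` right after the page is requested.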
This happens with two of the websites, but the program works correctly with a third one. That makes me think it is a networking issue rather than a programming issue. The server is rented from a VPS provider.
From the server, I can ping the host of one of the websites that returns empty HTML, which makes me think that my IP is not blacklisted or banned.
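Ping only exercises ICMP, so a TCP connection attempt is a closer test of what the browser does. This is a minimal sketch using the standard `socket` module. `can_open_tcp` is a hypothetical helper name, and the default port 443 is an assumption that the affected sites are served over HTTPS.

```python
import socket


def can_open_tcp(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, DNS failure, or timed out.
        return False
    conn.close()
    return True
```

Running `can_open_tcp("example.com")` on the server (with the real host name in place of the placeholder) shows whether the port is reachable at all. It does not test the TLS handshake itself.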
Here is the output of netstat -tulpen (run on the server):
tcp 0 0 0.0.0.0:41207 0.0.0.0:* LISTEN 0 267296 22458/phantomjs
tcp 0 0 0.0.0.0:38457 0.0.0.0:* LISTEN 0 267294 22463/phantomjs
tcp 0 0 0.0.0.0:33667 0.0.0.0:* LISTEN 0 267295 22461/phantomjs
I don't know how to debug this or how to understand what is happening.
Update: After some testing, I wrote a JS script that uses PhantomJS directly to dump the HTML content of a page and log errors.
It prints:
FAIL to load the address Error creating SSL context (error:140A90C4:SSL routines:func(169):reason(196))
So it could be related to PhantomJS itself, or to something that blocks it.
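One experiment that might narrow this down is to start PhantomJS with different SSL settings and see whether the error changes. This is a sketch, not a confirmed fix. It assumes the installed Selenium version still provides `webdriver.PhantomJS` and accepts `service_args`, and it uses the PhantomJS command-line options `--ssl-protocol` and `--ignore-ssl-errors`. The helper names `phantomjs_service_args` and `make_driver` are hypothetical.

```python
def phantomjs_service_args(ssl_protocol="any", ignore_ssl_errors=True):
    """Build the PhantomJS command-line options for the SSL experiment."""
    return [
        "--ssl-protocol=%s" % ssl_protocol,
        "--ignore-ssl-errors=%s" % ("true" if ignore_ssl_errors else "false"),
    ]


def make_driver():
    """Start PhantomJS through Selenium with the options above."""
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver
    return webdriver.PhantomJS(service_args=phantomjs_service_args())
```

If the pages load with these options, that would point to the SSL configuration on the server rather than to blocking by the websites. Ignoring SSL errors disables certificate checks, so it is only suitable for diagnosis.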