I have an Ubuntu 12.04 server on Amazon EC2 that runs a web crawling process. We're running into a problem: some of the web servers hosting the sites we need to crawl block all EC2 IP addresses.
My brilliant idea was to tunnel outgoing HTTP requests through a VPN. I was able to get the VPN set up, but it routed ALL traffic through the tunnel, which meant that I couldn't SSH into the machine and it wouldn't respond to any incoming HTTP requests. (This server also hosts a web service that we need to be able to access.)
Really, I just want to "proxy" the outgoing HTTP requests through the VPN so we can access sites that have all EC2 IPs blocked, while SSH and incoming HTTP traffic keep using the server's normal connection.
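To make the goal concrete, here is a rough sketch of what I'd like the crawler side to look like. It assumes the crawler is written in Python 3 and that there is an HTTP proxy reachable at the far end of the VPN; the address `10.8.0.1:3128` and the names `VPN_PROXY` and `build_crawler_opener` are made up for illustration and are not part of my actual setup.

```python
import urllib.request

# Hypothetical address: an HTTP proxy listening on the far end of the VPN tunnel.
VPN_PROXY = "http://10.8.0.1:3128"


def build_crawler_opener(proxy_url=VPN_PROXY):
    """Return an opener that sends outgoing http:// requests via proxy_url.

    Only requests made through this opener are affected; SSH and the
    web service on this machine are untouched.
    """
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)


opener = build_crawler_opener()
# The crawler would then fetch pages with something like:
# opener.open("http://example.com/", timeout=30)
```

With something like this, only the crawler's own requests would leave through the VPN, which is the behaviour I'm after. Whether a proxy like this or selective routing is the better way to get there is part of what I'm asking.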
It's very possible I'm going about this the wrong way, and I welcome any other suggestions that might be simpler or more robust.