Gotchas for reverse proxy setups

Question

We run multiple web applications, some internal-only, some internal/external. I'm putting together a proposal that we use reverse proxy servers to isolate the origin servers, provide SSL termination and (when possible) provide load balancing. For much of our setup, I'm sure it will work nicely, but we do have a few lesser-known proprietary applications that may need special treatment when we move forward with reverse-proxying.

What kinds of traps tend to cause problems when moving an origin server from being on the front lines to being behind a proxy? (For example, I can imagine problems if an application needed to know the IP address of incoming requests.)

score 6 · Accepted Answer · answered Mar 02 '11 at 10:17

The most frequent pitfalls are: redirects generated by the application, which you will have to rewrite in the reverse proxy; the client IP address issue you already mentioned; and, if you use SSL termination, the server may want to check the client certificate or read user information from it. For proper edge-side reverse proxy caching, application changes may be needed (adequate Expires headers, unsetting unneeded cookies, etc.). If you are using Windows integrated authentication, it may be unachievable or a true nightmare.

Then you could have tuning issues, but I think those will be much easier to resolve. My preferred toolset for this task would be:

Nginx in the outer layer for virtual host management and location mapping, compression, SSL, and access logging
Varnish for caching
HAProxy for request queuing, load balancing and backend checking
If you need high availability for your reverse-proxy box(es), keepalived does the job

score 5 · Answer 2 · answered Mar 02 '11 at 12:57

Problems you can have with applications behind a reverse proxy:

If the language or app uses IP addresses to keep sessions, the only IP addresses reaching the application will be the proxy's. With nginx and Varnish you can add the X-Forwarded-For header carrying the original IP address and make the app recognize it.
Session handling in load-balanced environments is tricky. The servers must keep their session information in a shared resource, so that a logged-in user can reach any backend of a load-balanced app and keep their session. Databases and Memcached are popular choices for session sharing.
If the reverse-proxied app uses absolute URLs in its responses, they may break the rewriting on the proxy, e.g. an app that issues 302 redirects to a URL different from the proxy's.
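To illustrate the X-Forwarded-For point, here is a minimal Python sketch of how an app might recover the client address. The proxy addresses in TRUSTED_PROXIES and the function name client_ip are made up for the example; the important part is that the header is only believed when the connection really comes from one of your own proxies, since any client can send that header itself.

```python
import ipaddress

# Hypothetical addresses of your own reverse proxies (assumption for this sketch).
TRUSTED_PROXIES = {
    ipaddress.ip_address("10.0.0.5"),
    ipaddress.ip_address("10.0.0.6"),
}


def client_ip(remote_addr, x_forwarded_for=None):
    """Return the best guess at the real client IP as a string.

    remote_addr is the peer address of the TCP connection;
    x_forwarded_for is the raw header value, or None if absent.
    """
    peer = ipaddress.ip_address(remote_addr)
    # Ignore the header unless the request came through one of our proxies.
    if peer not in TRUSTED_PROXIES or not x_forwarded_for:
        return str(peer)
    # Walk the chain right to left, skipping our own proxies. The first
    # address that is not ours is the client as seen by the outermost proxy.
    for hop in reversed([h.strip() for h in x_forwarded_for.split(",")]):
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: stop trusting the header
        if addr not in TRUSTED_PROXIES:
            return str(addr)
    return str(peer)
```

For example, client_ip("10.0.0.5", "203.0.113.7, 10.0.0.6") gives "203.0.113.7", while a direct request from an untrusted address is reported under its own address no matter what the header claims.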

I currently use nginx on the frontend with Varnish behind it for proxying and load balancing/backend checking. Because the reverse proxy/load balancer is a single point of failure, it is very important to run it as a cluster.

I use corosync/pacemaker (Linux HA, most recent version) to load balance the load balancers: three load balancers, each with an external IP address, round-robin balanced using DNS (one name points to the three IP addresses). If one of the machines goes down, corosync moves its IP address to one of the two remaining machines. If I add more machines/IP addresses they are balanced automatically, and if all but one machine is down, all IPs end up on the one that is still up. You can use corosync for active-active, active-passive and many other cluster configurations.

score 0 · Answer 3 · answered Mar 02 '11 at 07:46

A common (and cheap) way to do this is to use Squid as a reverse proxy. Take a look at their FAQ, where they discuss common problems: Squid Reverse Proxy FAQ

trent

score 0 · Answer 4 · answered Mar 02 '11 at 11:16

You may also face problems with internal redirects: behind the proxy, the application "thinks" it is located at one URL, which is different from the real (external) URL.
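As a rough illustration of the mapping involved, this Python sketch rewrites a redirect target from the backend's URL space to the public one. nginx's proxy_redirect and Apache's ProxyPassReverse directives cover this kind of rewriting on the proxy side; the function name rewrite_location and the host names in the example are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_location(location, internal_base, external_base):
    """Map a Location header value from the internal base URL to the external one."""
    loc = urlsplit(location)
    internal = urlsplit(internal_base)
    external = urlsplit(external_base)
    # Relative redirects and redirects to other hosts are left untouched
    # in this sketch.
    if not loc.netloc or (loc.scheme, loc.netloc) != (internal.scheme, internal.netloc):
        return location
    int_path = internal.path.rstrip("/")
    # Only rewrite targets that live under the internal base path.
    if loc.path != int_path and not loc.path.startswith(int_path + "/"):
        return location
    new_path = external.path.rstrip("/") + loc.path[len(int_path):]
    return urlunsplit(
        (external.scheme, external.netloc, new_path or "/", loc.query, loc.fragment)
    )
```

With an internal base of "http://app1.internal:8080" and an external base of "https://www.example.com/app1", a backend redirect to "http://app1.internal:8080/login?next=%2F" becomes "https://www.example.com/app1/login?next=%2F". Absolute URLs embedded in response bodies are a separate problem that a header rewrite like this does not touch.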

Jesus Cuenca

score 0 · Answer 5 · answered Mar 02 '11 at 12:27

On Linux/BSD and some other operating systems it's quite possible for the proxy to masquerade as the client IP, so you don't lose visibility (this uses the OS's networking functions, iptables / ipfw, not the proxy's).

If you are using client certificate authentication on any of the applications, you'll have difficulty moving the SSL termination point while maintaining security.

Load balancing can be tricky: for HTTP apps, you've got to either replicate state in real time across the cluster or use 'sticky sessions', which rather undermines the principle of fault tolerance (a hybrid approach using limited replication/defined session failover is usually the most practical solution).
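One way to picture the hybrid approach is sticky routing with a predictable failover target, so session state only needs to be replicated to one known neighbour rather than to the whole cluster. The following Python sketch is an assumption about how that could look, not how any particular load balancer implements it; pick_backend and the backend names are made up.

```python
import hashlib


def pick_backend(session_id, backends, healthy):
    """Choose a backend for a session: sticky while the preferred node is up.

    backends is an ordered list of names; healthy is the set of names
    currently passing health checks.
    """
    if not backends:
        raise ValueError("no backends configured")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(backends)
    # Try the preferred backend first, then its neighbours in list order,
    # so each session has a defined failover target.
    for offset in range(len(backends)):
        candidate = backends[(start + offset) % len(backends)]
        if candidate in healthy:
            return candidate
    raise RuntimeError("no healthy backend available")
```

The same session ID always lands on the same backend while that backend is healthy, and moves to the next one in the list when it is not. Note that with this simple modulo scheme, adding or removing a backend reshuffles most sessions.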
