I would like to put a reverse proxy (Apache httpd with mod_proxy) in front of an IIS server hosting a SOAP web service.

The problem I'm facing is that SOAP clients ask the web server for details about the web service by downloading the WSDL from the endpoint. This WSDL is generated by the web service and contains the URLs that the client should use to reach it. When the web service is behind the proxy, the generated URLs contain the wrong, private address.

The IIS web service is located at http://internal.host.com/Dirname/Service.asmx and it can be called with GET or POST.

The WSDL is retrieved with GET with WSDL as the query string:

http://internal.host.com/Dirname/Service.asmx?WSDL

The reverse proxy presents the web service as:

https://proxy.host.com/VirtualDir/Service.asmx

And my problem is that the dynamically generated WSDL contains the internal URLs (the ones the proxy connects to).

I would like to avoid hacking/recompiling the web service itself, so while thinking about alternative solutions, I wondered:

Could I make httpd on the proxy server somehow intercept the call to the ?WSDL document and serve static content instead, and still forward other queries (including GET parameters) to the internal IIS server?

The relevant httpd config currently looks like this:

ProxyPass "/VirtualDir/"  "http://internal.host.com/Dirname/"

And I was thinking that perhaps RewriteCond and RewriteRule could be used in some clever way to catch only the requests to the /.../Service.asmx?WSDL and serve a static local document instead, and forward "the rest" to the IIS, but I don't really know how to do that correctly without breaking anything else.

The reverse proxy is used for other services as well, under other virtual "directories".

MattBianco
  • Are the internal URLs in the dynamically generated WSDL file significant in some way? i.e. Do they need correcting to use "proxy.host.com"? – Unbeliever Sep 26 '16 at 13:46
  • Yes, I believe it works like this: Client grabs WSDL, looks inside it for what addresses to use for further communication. The problem is that the URLs it contains are generated based on the address used when the proxy queries the backend web server. – MattBianco Sep 26 '16 at 14:00

1 Answer

I've worked around the problem for the time being by creating a static WSDL specification that contains the proper (proxied) addresses of the service methods, and telling the clients to use that one instead.

It seems to work just fine, but it will have to be updated each time the WebService interface changes.

I've set up a static file on the Apache httpd proxy at this address:

https://proxy.host.com/other_static_dir/Service.wsdl

It contains the correct WSDL specification with the public URIs to use. I downloaded it from IIS with wget and fixed the addresses in a text editor.
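To make this transparent to clients that request Service.asmx?WSDL directly, something along these lines might work. It is only a sketch and untested against this setup: it assumes Apache httpd 2.4 with mod_rewrite and mod_mime loaded, that the directives sit in the virtual host for proxy.host.com, and that /other_static_dir/ is already served by that host. It also relies on RewriteRule being evaluated before ProxyPass, which is the order the mod_proxy documentation describes.

```apache
# Sketch only: assumes httpd 2.4 and that these directives live in the
# <VirtualHost> that serves proxy.host.com.
RewriteEngine On

# Only requests whose query string is exactly "WSDL" (any letter case).
RewriteCond %{QUERY_STRING} ^wsdl$ [NC]
# Map them to the static copy instead of the backend.
#   PT  hands the new URL-path back to normal URL mapping (Alias/DocumentRoot)
#   QSD drops the "?WSDL" query string (2.4 only; on 2.2 end the target with "?")
RewriteRule ^/VirtualDir/Service\.asmx$ /other_static_dir/Service.wsdl [NC,PT,QSD]

# Assumption: clients expect the WSDL with an XML content type.
AddType text/xml .wsdl

# Everything else under /VirtualDir/ (including other GET parameters)
# still goes to IIS, as before.
ProxyPass "/VirtualDir/"  "http://internal.host.com/Dirname/"
```

Because the rewritten path no longer starts with /VirtualDir/, the ProxyPass line should not pick it up, and other virtual directories on the proxy are not affected by the rule.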

MattBianco
  • mod_rewrite cannot modify the contents of a page. For that you need mod_sed or mod_substitute, so you need to use them if you wish to do it dynamically. It's possible you could use mod_proxy_html to do it, but I've never tried it with XML – Unbeliever Sep 26 '16 at 14:10
  • You can use a cron job that once in a while (you choose the frequency) grabs the wsdl from the BE and replaces the internal IP with the external FQDN. – Fredi Sep 26 '16 at 14:11
  • I'm not good at writing concise problem descriptions. What I really wanted to do was to intercept the GET request for the original (dynamic) WSDL and rewrite it to serve the static file, so it would be 100% transparent to the end users. (the service name with ?WSDL added to the end of the URL) – MattBianco Sep 26 '16 at 14:40
  • You could get that effect with mod_sed/mod_substitute/mod_proxy_html (to modify the content) and then mod_cache with a long expiry time to cache the file locally on apache – Unbeliever Sep 26 '16 at 19:03
  • perhaps a `ProxyPreserveHost on` could have been enough. – ezra-s Sep 28 '16 at 23:34
  • `ProxyPreserveHost on` sounds good, provided that the backend doesn't care about the `Host:` header, which it would in case of name based virtual hosting. – MattBianco Sep 29 '16 at 07:12
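
For the dynamic approach suggested in the comments, a mod_substitute configuration could look roughly like the following. This is a hedged sketch, not something verified against this service: it assumes httpd 2.4 with mod_substitute, mod_filter and mod_headers loaded, and that IIS returns the WSDL with a text/xml content type.

```apache
# Sketch only: rewrite the internal base URL in XML responses passing
# through the proxy for this one virtual directory.
<Location "/VirtualDir/">
    # Ask the backend for an uncompressed body, since the filter
    # cannot match text inside a gzip-compressed response.
    RequestHeader unset Accept-Encoding

    # Apply the substitution filter to XML responses only
    # (assumption: the WSDL is sent as text/xml).
    AddOutputFilterByType SUBSTITUTE text/xml

    # n = treat the pattern as a fixed string, i = ignore case.
    Substitute "s|http://internal.host.com/Dirname/|https://proxy.host.com/VirtualDir/|ni"
</Location>
```

Note that this filter would also touch any SOAP response of the same content type that happens to contain the internal URL, so the static-file approach is the narrower change.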