How does my HTTPS POST get blocked based on XML content?

Question

There's a web application on a server I have full access to, which accepts POST requests on a REST endpoint. The request payload is expected to be an XML document. For request routing and load balancing the server uses HAProxy, which also takes care of TLS and has the certificate chain and private key in its configuration.

Recently some POSTs have been failing. The response has HTTP status code 200, but its payload is an HTML document rather than the XML that the service ought to return. It takes this form:

<html>

<head>
    <title>Request Rejected</title>
</head>

<body>The requested URL was rejected. Please consult with your
    administrator.<br><br>Your support ID is: [redacted]<br><br><a href='javascript:history.back();'>[Go
        Back]</a></body>

</html>

What is curious is that this only occurs if the XML has certain character entity references in its contents. I've tested and established these conditions:

The call fails with the above response if the entity reference &#xD; (carriage return) or &#xA; (newline) is present in the XML payload, regardless of which or in what combination.
It does not fail if we replace these with decimal instead of hexadecimal references (&#13; and &#10;).
I see no logging for the call in either the web service or the HAProxy logs, even though logging is set up so that if the POST made it to the server we'd expect to see at least something in HAProxy.
If the POST is replicated with curl on the server itself, bypassing HAProxy and external networks, it succeeds.
Trying the call with Postman and the "problematic" XML, I can see that the certificate for the TLS handshake is the one set up in HAProxy, so there is no replacement by a MITM node.
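
As a side check on the first two observations: to a conforming XML parser the hexadecimal and decimal forms decode to the same characters, so the difference can only matter to something that inspects the raw request body. Below is a minimal Python sketch using only the standard library. The helper name hex_refs_to_decimal and the sample payload are made up for illustration and are not part of the actual service.

```python
import re
import xml.etree.ElementTree as ET

# Matches hexadecimal character references such as &#xD; or &#x0A;
HEX_REF = re.compile(r"&#x([0-9A-Fa-f]+);")


def hex_refs_to_decimal(xml_text: str) -> str:
    """Rewrite hexadecimal character references as decimal ones.

    Plain text substitution: it does not special-case CDATA sections
    or comments, where such a sequence would be literal text.
    """
    return HEX_REF.sub(lambda m: f"&#{int(m.group(1), 16)};", xml_text)


# Hypothetical payload, not the real request body.
hex_payload = "<note>line one&#xD;&#xA;line two</note>"
dec_payload = hex_refs_to_decimal(hex_payload)

# Both forms decode to the same text once parsed.
hex_text = ET.fromstring(hex_payload).text
dec_text = ET.fromstring(dec_payload).text
same = hex_text == dec_text
```

Since the parsed result is identical, a block that depends on which notation is used points at pattern matching on the unparsed body rather than at anything wrong with the XML itself.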

The first thing I'm wondering, of course, is why on earth a POST would get blocked because of some perfectly fine entity references in it. It's not an XML bomb, the XML is well-formed and I don't see an issue with the HTTPS POST itself. Looking online for that particular HTML response it seems like a firewall is the most likely culprit.

But that raises a second question. How could some node along the network block the POST based on its contents if it's performed over HTTPS? The payload ought to be encrypted, and the TLS connection is established between the client and HAProxy on the server.

EDIT: We contacted the hosting provider, who confirmed there was an issue with RPaaS. They've made a change and the requests get through now. That still leaves the question of how a connection over HTTPS could have requests blocked based on content.

Your second question is based on the assumption that the block is done between client and HAproxy. Nothing in the information you provide suggests this and since you are using HTTPS I very much doubt that this is the case. There is likely some component like a WAF between HAproxy and the server or a WAF module used in haproxy. — Steffen Ullrich, Jan 12 '22 at 14:09
@SteffenUllrich See edit. It was in fact something between the client and HAProxy. Also I mentioned that no logging of the blocked requests was found in HAProxy while every other request that arrived at the server did get logged. HAProxy is running in a Docker container as is the web application, they're in the same Docker network. As far as I know there's no WAF module set up in the HAProxy instance. I could certainly be missing something but everything points to the requests being blocked before they ever get to HAProxy. — G_H, Jan 12 '22 at 14:50
Something is likely wrong in your explanation of the setup. Either there is no end-to-end encryption between client and haproxy but the TLS is terminated at the rpaas (and maybe a new HTTPS connection is created between rpaas and haproxy), or the rpaas is between haproxy and server (so it gets decrypted traffic), or the rpaas is actually the same as the haproxy. — Steffen Ullrich, Jan 12 '22 at 14:54
"Your support ID is:" makes me think there's an F5 ASM Web Application Firewall in the middle of your connection. That's their standard text for rejection letters. — gowenfawr, Jan 12 '22 at 15:04
@SteffenUllrich If the RPaaS acted as a man-in-the-middle and created a new HTTPs connection, wouldn't I be able to tell from the certificate? The cert for the connection given to the client did seem to be our own, and as far as I know we didn't provide the host with the private key. I really don't think the RPaaS could be between HAProxy and the service because, as I said, they're both in the same Docker network and HAProxy routes the requests using the Docker container names. I agree that I must be missing something, but so far it eludes me what. — G_H, Jan 12 '22 at 15:14
@gowenfawr F5 ASM does indeed seem to pop up regularly with searches for that response. Hopefully the hosting provider will be able to give some more insight into what's going on and how on earth content from a HTTPS request managed to get blocked. — G_H, Jan 12 '22 at 15:15
I just noticed that the response has a cookie with a name starting with TS01 which also implies an F5 firewall/load balancer. It seems more and more likely that it does act as a MITM. I'll have another close look at that certificate and ask our infra guys if they don't happen to actually have the private key. — G_H, Jan 12 '22 at 15:31
@SteffenUllrich You were correct, I had completely missed a difference in this particular setup from our usual approach. The TLS termination was at the hosting provider. I posted an answer with some pointers in case some other sap runs into this niche problem. — G_H, Jan 12 '22 at 15:45

score 0 · Answer 1 · answered Jan 12 '22 at 15:09

It might be due to an accumulation of browsing data in your browser. Try clearing your browser's cache and cookies, then try again.

CK Chan

Seems to occur with requests from browsers sometimes, yes. But I made the request from Postman and made sure to clear all cookies. The issue could also be consistently replicated by others using curl or using Postman on a different machine. – G_H Jan 12 '22 at 15:16

score 0 · Accepted Answer · answered Jan 12 '22 at 15:44

The comments by Steffen Ullrich made me take a second look and indeed, I had missed something crucial in the setup. So far all the services I've maintained had the certificates set up in HAProxy, which took care of the TLS connection (or redirecting HTTP to HTTPS). In this case, however, HAProxy was set up to simply take HTTP traffic and the TLS connection was handled by the hosting provider, who really had been provided the certificate and key.

So, in retrospect a silly question. But in case it is of some use to someone else experiencing this particular issue, check these things:

Find out if your hosting provider or something along the route that has access to the cleartext traffic is using an F5 ASM firewall.
Check for the presence of cookies in the response that suggest the above (for example a cookie with a name starting with TS).
If the above is the case, check with your provider whether their RPaaS policies need to be updated to avoid false positives on XML payloads with hex entity references.
Actually carefully check your setup rather than making assumptions about what you think it is ;)
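
The first two checks can be roughly automated. Here is a minimal Python sketch that looks for the markers mentioned in this thread: the "Request Rejected" title, the "Your support ID is" text, and a cookie whose name starts with TS. The function name f5_asm_indicators and the sample values are hypothetical, and these are heuristics only; a match suggests an F5 ASM block page but does not prove it.

```python
def f5_asm_indicators(headers: list[tuple[str, str]], body: str) -> list[str]:
    """Return reasons to suspect an F5 ASM block page (empty list if none).

    headers: (name, value) pairs from the HTTP response.
    body: the decoded response body.
    """
    found = []
    if "<title>Request Rejected</title>" in body:
        found.append("title 'Request Rejected'")
    if "Your support ID is" in body:
        found.append("support ID text")
    for name, value in headers:
        if name.lower() == "set-cookie":
            # The cookie name is everything before the first '='.
            cookie_name = value.split("=", 1)[0].strip()
            if cookie_name.startswith("TS"):
                found.append(f"cookie {cookie_name}")
    return found


# Made-up sample response, modelled on the rejection page in the question.
sample_headers = [
    ("Content-Type", "text/html"),
    ("Set-Cookie", "TS01abcdef=0123456789; Path=/"),
]
sample_body = (
    "<html><head><title>Request Rejected</title></head>"
    "<body>The requested URL was rejected. Your support ID is: 123</body></html>"
)
indicators = f5_asm_indicators(sample_headers, sample_body)
```

The header format matches what http.client's HTTPResponse.getheaders() returns, so the function can be fed a real response directly. Remember that the blocked requests in this case came back with status 200, so checking the status code alone would not have caught them.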
