Nextgen firewalls - encrypted traffic inspection

Question

I recently read about next generation firewalls that use deep-packet-inspection, intrusion-prevention and something the manufacturers call encrypted-traffic-inspection or encrypted-traffic-analytics.

The manufacturers claim the encrypted traffic inspection/analysis in nextgen firewalls is made without decrypting the traffic.

Can someone explain intuitively how these appliances work and what the main points to consider are from a cryptographic perspective?

There are many, many questions about this topic already. Please first study what was already asked and answered, then focus your question on the part you feel is not adequately answered, and explain exactly what is missing from the existing questions and answers. — Steffen Ullrich, Jul 11 '21 at 06:06
Your question can be interpreted in two ways: analysis of encrypted traffic with decryption and analysis without decryption. This open interpretation can also be seen in the kind of answers you got and accepted, so it was not *"definitely out of the question"* for others reading your question. If you want to limit the question to analysis without decryption (i.e. passive traffic inspection) then please make this clear in your question. Note though that there are already some questions about this here too, like [Passive fingerprinting of HTTPS client](https://security.stackexchange.com/questions/127268/). — Steffen Ullrich, Jul 11 '21 at 08:06
The answer I accepted touches, in its last paragraph, on how inspection can be done in nextgen firewalls. Based mostly on product information from OEM manufacturers, it seems they are using machine learning to identify threats. Generally speaking, this is how inspection of encrypted (not decrypted) traffic should be done in these appliances. — Roman Gherta, Jul 11 '21 at 08:43
The answer you accepted is about decrypting the traffic. Your actions and your desired outcomes are confusing and in conflict. — schroeder, Jul 11 '21 at 09:47
Can you provide a reference where a manufacturer claims inspection without decryption? And can you explain how one inspects encrypted traffic from a "cryptographic perspective"? — schroeder, Jul 11 '21 at 09:50
I am an expert in NGFWs and ML analytics. I've also been in the infosec industry for almost 20 years. If I am confused, perhaps the confusion is with your wording. I am asking for clarity. That should not result in snide comments from you. — schroeder, Jul 11 '21 at 09:52
I came across this term only recently while refreshing my memory with some documentation for CCNA 200-300. I do not want to promote brands here, but this is a new trend in the industry and I believe we must explain these "hypie" words that might confuse a lot of people. I do not claim to know the answer but rather ask for the opinion of more experienced members. I wrote the first paragraph exactly as it appears there, and it was written by someone with dozens of years of experience in networking. The IT world is ever changing and we must continuously re-learn and readapt with a bit of humility. — Roman Gherta, Jul 11 '21 at 10:00
Which paragraph appeared where? And I'm not asking you to promote a product but to provide a source so that we are all on the same page. Are you talking about this https://www.gartner.com/imagesrv/media-products/pdf/radware/Radware-1-2Y7FR0I.pdf ? — schroeder, Jul 11 '21 at 18:03

Steffen Ullrich · Accepted Answer · 2021-07-11T09:48:33.693

The manufacturers claim the encrypted traffic inspection/analysis in nextgen firewalls is made without decrypting the traffic.

While some analysis can be done without decryption, it is fairly limited compared to full decryption. But in some cases full decryption is not possible (privacy reasons, certificate pinning, client certificates, ... or simply because an IDS only does passive analysis), so some systems additionally offer analysis without decryption. Such analysis is restricted to metadata contained in the initial handshake and to traffic patterns, i.e. timing, direction and size of packets.

Traffic patterns are often combined with some kind of statistics or more or less sophisticated machine learning in order to detect the kind of traffic. This can be used for example to detect DNS over TLS or HTTPS, or streaming video, but it can also highlight unusual traffic (small request, small response and then connection close) which might be associated with C2 communication. In general such methods have significant false positive and false negative rates, so they can serve as interesting indicators but not as a source of 100% reliable information.
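As a rough illustration of the "small request, small response, then close" pattern, here is a minimal sketch that works only on packet direction, size and timing. The `Packet` type, the function names and the thresholds are all assumptions made up for this example; they are not taken from any product, which would typically feed such features into a statistical or ML model instead of a fixed rule.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    timestamp: float  # seconds since the flow started
    outbound: bool    # True if sent by the client
    size: int         # encrypted payload size in bytes


def flow_features(packets: List[Packet]) -> Dict[str, float]:
    """Summarise a flow using only metadata visible without decryption."""
    out_bytes = sum(p.size for p in packets if p.outbound)
    in_bytes = sum(p.size for p in packets if not p.outbound)
    duration = packets[-1].timestamp - packets[0].timestamp if packets else 0.0
    return {
        "packets": len(packets),
        "out_bytes": out_bytes,
        "in_bytes": in_bytes,
        "duration": duration,
    }


def looks_like_beacon(packets: List[Packet],
                      max_bytes: int = 512,
                      max_packets: int = 6) -> bool:
    """Hypothetical rule: very few packets and very little data each way."""
    f = flow_features(packets)
    return (0 < f["packets"] <= max_packets
            and f["out_bytes"] <= max_bytes
            and f["in_bytes"] <= max_bytes)
```

A rule like this will also match plenty of benign flows (health checks, short API calls), which is exactly the false positive problem mentioned above.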

Metadata of the initial handshake can be used in heuristics to infer the kind of client (JA3 fingerprint) or server (JA3S fingerprint) based on offered or supported ciphers, protocols, TLS extensions, and so on. The client fingerprint is especially interesting, since it can often be used to distinguish normal browsers from other clients such as malware. This is because browsers often use a different TLS stack, or different settings in the TLS stack, than other clients. But there is nothing inherently magic here, so malware might also generate the same fingerprint as a browser. Combined with traffic patterns, though, it can result in more reliable detection.
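To make the JA3 idea concrete: the fingerprint is an MD5 hash over five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats), written as decimal numbers with `-` inside a field and `,` between fields, with GREASE placeholder values ignored. The sketch below assumes the ClientHello has already been parsed into lists of integers; the function name and the sample values in the test are illustrative only.

```python
import hashlib
from typing import Iterable, Tuple

# GREASE values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA. JA3 ignores them.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def _join(values: Iterable[int]) -> str:
    """Join values as decimal numbers with '-', skipping GREASE entries."""
    return "-".join(str(v) for v in values if v not in GREASE)


def ja3(version: int,
        ciphers: Iterable[int],
        extensions: Iterable[int],
        curves: Iterable[int],
        point_formats: Iterable[int]) -> Tuple[str, str]:
    """Return (ja3_string, md5_hex) for already-parsed ClientHello fields."""
    ja3_string = ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats),
    ])
    digest = hashlib.md5(ja3_string.encode("ascii")).hexdigest()
    return ja3_string, digest
```

Because the input is just the order and content of what the client offers, any client that copies a browser's ClientHello produces the same hash, which is why the fingerprint alone is not conclusive.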

Inside the TLS ClientHello there is also usually the server name in clear (SNI) which can be used to compare against block lists. With ESNI though this information is slowly vanishing.
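A block-list check on the SNI value can be as simple as the following sketch. It assumes the server name has already been extracted from the ClientHello and that a block-list entry covers the listed domain and all of its subdomains; both the function name and that matching policy are assumptions for illustration.

```python
from typing import Set


def sni_blocked(server_name: str, blocklist: Set[str]) -> bool:
    """Return True if the name or any of its parent domains is block-listed."""
    name = server_name.rstrip(".").lower()
    labels = name.split(".")
    # Check "a.b.c", then "b.c", then "c" against the block list.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))
```

Matching on whole labels avoids accidentally blocking `notbad.example` because `bad.example` is listed.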

Access to the certificate is vanishing even faster. Up to TLS 1.2 the server certificate was sent in clear and could easily be extracted. It was used in IDS/IPS for example to detect self-signed or unusual certificates, which in the past were often associated with C2 communication. With TLS 1.3, though, the certificates are encrypted. Also, malicious sites today often use the free certificates offered by Let's Encrypt and others, so the value of having access to the certificate has decreased.

For a deeper and much longer analysis see Encrypted Traffic Analysis - Use Cases & Security Challenges from ENISA.

I was thinking they were analyzing the encrypted traffic with ML with a high level of confidence, which would be in contradiction with the randomness of the crypto algorithms. This is why I posted the question on the cryptography site. But now I see it is all about patterns in metadata. — Roman Gherta, Jul 11 '21 at 10:44
@RomanGherta: Correct, no ML can extract information from properly encrypted information itself. It's just statistics, not magic. — Steffen Ullrich, Jul 11 '21 at 10:46

Maarten Bodewes · Answer 2 · 2021-07-11T10:11:34.980

These encrypted-traffic-inspection appliances authenticate as if they are the website you are trying to visit. They can do this because they use a server certificate (and private key) signed by their own CA / PKI, whose root certificate is trusted by your browser / applications.

These trusted (root) certificates are generally shared through e.g. Active Directory group policies controlled by an admin, which pushes them into the Windows certificate store. This certificate store is also used by e.g. Chrome on Windows, although Google is thinking about using their own. In that case I presume some kind of service or application needs to configure the certificates in the browsers.

Finally they act as a client to the other system, operating as a man-in-the-middle, inspecting the (plaintext) data in the connection before forwarding it.

If no MitM configuration is possible then other properties may still be studied. It is possible to also inspect meta-data (as presented in the certificates within the header), the protocol used as well as side channel data such as packet / stream size and timing information. It is known that it is even possible to decode speech that way for specific codecs.

However, it is unknown to me how advanced these kinds of techniques are and what their success rates are. It is also possible for e.g. malware writers to make their communication harder to detect in response, triggering another arms race.

Generally the use of PKI is handled here, and we get the ones about the algorithms etc. in return. For this kind of thing you need knowledge about next generation firewalls, which is not really part of a cryptographer's toolbox. Interesting about looking at "shadows" there, interesting article. But I guess that there is no need for that kind of **heuristics** if you can actually decrypt. If the term *encrypted traffic analysis* is specific to ciphertext/TLS protocol analysis I would like to know as well. — Maarten Bodewes, Jul 11 '21 at 01:01
added the term encrypted-traffic-analytics to the list. But it sounds like hype and the process seems like inspection. — Roman Gherta, Jul 11 '21 at 01:37

score 1 · Answer 3 · answered Jul 11 '21 at 01:06

The encrypted traffic inspection, or TLS interception, can occur because they present a certificate that is trusted by the client. This is usually because in a corporate environment, the network administrators have installed a root certificate into the list of authorities trusted by the client. The certificate presented to the end user is generated on the fly and the client essentially sets up a TLS connection to the middlebox. The middlebox then sets up a connection to the real server and proxies the data to the real server, inspecting it, and possibly modifying it.

With a modern version of TLS (1.2 or newer), it isn't possible to intercept data unless the TLS middlebox is trusted in this way; the protocol would be insecure if it allowed this otherwise.

A TLS middlebox in this approach can be a physical firewall device or a proxy, and additionally some antivirus and firewall programs, mostly on Windows, implement this functionality.

Note that cryptographic research has found that many TLS middleboxes contain security problems. For example, they may only support older, less secure algorithms or parameters; they may fail to validate certificates correctly or at all; they may not support the latest version of TLS; or they may fail to implement security-relevant extensions, like encrypted ClientHello. This is in addition to various cases where they are known not to implement HTTP properly and therefore break tools relying on it, like Git. Thus, unless you are sure your implementation doesn't suffer from any of these flaws, deploying such a middlebox is probably unwise.

It is possible to do some analysis on encrypted payloads. TLS supports padding, but it is less common with modern AEAD algorithms, so most connections leak some amount of information about the size of the data being transferred. If the ClientHello is not encrypted, it's possible to read the server name from the SNI extension, as well as information about the inner protocol via ALPN. These are used in some countries to implement censorship. Additionally, timing analysis is possible as well, but that is not always interesting. Provided you cannot decrypt the data, though, you cannot see what the actual content is.
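The size leak can be shown with simple arithmetic. The sketch below assumes TLS 1.3 records protected with AES-GCM or ChaCha20-Poly1305 (16-byte tag) and no record padding; under those assumptions each record on the wire is the plaintext plus a fixed overhead, so an observer gets an upper bound on the plaintext length. The function names are made up for this example.

```python
from typing import Iterable

RECORD_HEADER = 5  # content type (1) + legacy version (2) + length (2)
AEAD_TAG = 16      # tag size for AES-GCM and ChaCha20-Poly1305
INNER_TYPE = 1     # TLS 1.3 inner content-type byte inside the ciphertext


def max_plaintext_len(record_len_on_wire: int) -> int:
    """Upper bound on plaintext bytes in one TLS 1.3 record.

    It is exact when no padding is used; padding only lowers the true value.
    """
    overhead = RECORD_HEADER + AEAD_TAG + INNER_TYPE
    return max(0, record_len_on_wire - overhead)


def estimate_transfer_size(record_lengths: Iterable[int]) -> int:
    """Sum the per-record bounds for one direction of a connection."""
    return sum(max_plaintext_len(n) for n in record_lengths)
```

An observer who knows the sizes of the resources a site serves can sometimes use such estimates to guess which resource was fetched, without seeing any content.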
