2

How Does Application Visibility and Control Work? The application identification (App ID) classification engine and application signature pattern-matching engine operate at Layer 7 and inspect the actual content of the payload for identifying applications. App ID performs a deep packet inspection (DPI) of traffic on the network and on every packet in the flow that passes through the application identification engine until the application is identified. Application findings such as IP addresses, hostnames, and port ranges are saved in the application system cache (ASC) to expedite future identification. -- Juniper

AVC uses stateful deep packet inspection (DPI) to classify more than 1400 applications. It can also combine DPI with techniques such as statistical classification, socket caching, service discovery, auto learning, and DNS-AS. Custom applications can detect native apps. -- Cisco AVC

The inspection of thousands of traffic patterns over several years led Meraki to create a database of traffic signatures that can be used to recognize network traffic at the application level. -- Meraki (Cisco)

AppRF performs deep packet inspection (DPI) of local traffic and detects over 1500 applications on the network. AppRF allows you to configure both application and application category policies within a given user role. WebCC uses a cloud-based service to dynamically determine the types of websites being visited, and their safety -- Aruba (HP)

Several NGFW vendors perform application control, but the technique they use isn't documented. Each vendor gives a brief explanation of how it works, with no details.

I'm asking whether anyone knows what happens in the background when no SSL interception is used on the firewall and all traffic is carried over HTTPS (TLS 1.2), so the URL is hidden too.

How does the NGFW identify and see inside Google, Facebook, etc. traffic and separate it into videos, games, chat, and so on? One way would be to identify IPs, if different ranges are dedicated to specific services, but this technique doesn't offer much granularity. The most interesting part is "traffic patterns". How are they built, and what is the false positive risk of blocking a valid application that has a pattern "similar" to a famous app?

Steffen Ullrich
emirjonb
  • please link to the actual sources – schroeder Feb 22 '18 at 09:21
  • You state that URLs are hidden too. Are you aware that TLS does not encrypt the domain part of the URL? Your use case is when there is no SSL interception, but are you aware that interception is required to perform DPI? Does that change your question? – schroeder Feb 22 '18 at 09:23
  • The URL is visible only in the first handshake over 443, and it can usually also be found in the reverse DNS record (star-mini.c10r.facebook.com [157.240.20.35]), but after that no app/service can be identified. I have some experience with HP (Aruba), Cisco (SFR), TippingPoint, and Barracuda, and saw accurate application identification without SSL interception or an HTTP proxy – emirjonb Feb 22 '18 at 09:57
  • And the rest of the session traffic becomes easy to identify from the handshake – schroeder Feb 22 '18 at 10:06
  • To perform DPI requires interception. Traffic classification is different. – schroeder Feb 22 '18 at 10:07
  • @emirjonb: The target hostname is visible in each TLS handshake if SNI is used, which is the case for all browsers. With a full TLS handshake, the certificate content can additionally be used for classification. Even if SNI were not used, the system might correlate the current target IP with earlier DNS lookups that resulted in this IP address. – Steffen Ullrich Feb 22 '18 at 10:08
  • @emirjonb: I've adapted the title so that it hopefully better reflects what you ask, i.e. it looks like your question is limited to application detection within TLS but without TLS interception. – Steffen Ullrich Feb 22 '18 at 10:17
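
To make the SNI point from the comments concrete, here is a minimal sketch of how a passive device could read the hostname from an unencrypted TLS ClientHello. It assumes the whole ClientHello arrives in a single TLS record, and it is not how any particular vendor does it. The names `extract_sni` and `build_client_hello` are hypothetical helpers, and the synthetic ClientHello exists only so the parser has something to run against.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI hostname from a raw TLS ClientHello record, or None."""
    # TLS record header: content type (1), version (2), length (2)
    if len(record) < 5 or record[0] != 0x16:          # 0x16 = handshake
        return None
    pos = 5
    # Handshake header: message type (1), length (3)
    if len(record) < pos + 4 or record[pos] != 0x01:  # 0x01 = ClientHello
        return None
    pos += 4
    pos += 2 + 32                                     # client version + random
    try:
        pos += 1 + record[pos]                        # session id
        (cs_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + cs_len                             # cipher suites
        pos += 1 + record[pos]                        # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0x0000:                    # server_name extension
                # list length (2), name type (1), name length (2), name
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                if name_type != 0:                    # 0 = host_name
                    return None
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


def build_client_hello(hostname: str) -> bytes:
    """Build a minimal synthetic ClientHello carrying only an SNI extension."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni_body = struct.pack("!H", len(entry)) + entry
    ext = struct.pack("!HH", 0x0000, len(sni_body)) + sni_body
    body = (
        b"\x03\x03" + bytes(32)                # TLS 1.2, zeroed random
        + b"\x00"                              # empty session id
        + struct.pack("!H", 2) + b"\x00\x2f"   # one cipher suite
        + b"\x01\x00"                          # null compression only
        + struct.pack("!H", len(ext)) + ext
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


if __name__ == "__main__":
    print(extract_sni(build_client_hello("star-mini.c10r.facebook.com")))
```

A real classifier would also have to reassemble ClientHellos split across TCP segments or TLS records, which this sketch skips.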

3 Answers

1

They do it on a certificate basis. They check the data in the certificate exchange between server and client, and based on that they know whether you are trying to access Dropbox, Outlook, or any other site. The CN or SAN of Dropbox's certificate is www.dropbox.com, so they know you are trying to access it.

In other cases, such as telling YouTube apart from general browsing, application control is problematic because the certificate is issued for a wildcard: *.google.com. This makes firewalls unable to differentiate between browsing and YouTube. Gmail, by contrast, has its own CN/SAN, gmail.google.com. I don't know whether firewalls have implemented a parallel mechanism to detect YouTube, but not too long ago they were unable to. Regards.
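
As a rough illustration of the wildcard problem described above, the sketch below matches certificate names against a small signature table. Everything in it is an assumption made for the example: the `APP_HOSTS` table, the application labels, and the `example.com` hostnames are hypothetical, and only www.dropbox.com comes from the answer itself.

```python
from typing import Dict, List, Set

# Hypothetical signature table: hostname -> application label.
APP_HOSTS: Dict[str, str] = {
    "www.dropbox.com": "Dropbox",
    "video.example.com": "ExampleVideo",
    "mail.example.com": "ExampleMail",
}


def name_matches(cert_name: str, hostname: str) -> bool:
    """Match a CN/SAN entry against a hostname.

    A leading '*' label covers exactly one label, as in '*.example.com'.
    """
    c_labels = cert_name.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(c_labels) != len(h_labels):
        return False
    if c_labels[0] == "*":
        return c_labels[1:] == h_labels[1:]
    return c_labels == h_labels


def candidate_apps(cert_names: List[str]) -> Set[str]:
    """Return every application whose hostname is covered by the certificate."""
    return {
        app
        for host, app in APP_HOSTS.items()
        for name in cert_names
        if name_matches(name, host)
    }


if __name__ == "__main__":
    # A dedicated certificate name points at exactly one application.
    print(candidate_apps(["www.dropbox.com"]))
    # A wildcard certificate covers several applications at once.
    print(candidate_apps(["*.example.com"]))
```

When `candidate_apps` returns more than one label, the certificate alone cannot separate the applications, which is the ambiguity the answer describes.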

bulw4rk
  • +1. Well, most UTMs do not do this, but this is how they SHOULD do it -- https://www.slideshare.net/AndrewBeard1/detecting-malicious-ssl-certificates-using-bro -- https://www.bro.org/current/solutions/ssl/ -- https://www.bro.org/sphinx/scripts/policy/protocols/ssl/validate-certs.bro.html -- https://github.com/salesforce/ja3 -- https://mpars0ns.github.io/archc0n-2016-tls-slides/ – atdre May 25 '18 at 20:54
0

Because the source and destination addresses and ports (SRC, DST, SPRT, DPRT) are not encrypted.

Encapsulation through the network stack (e.g. IP, TCP, UDP) provides no key exchange mechanism for encrypting or decrypting the packet headers, so the addresses stay in clear text and can easily be resolved via DNS, WHOIS, or both to a service provider or the owner of an IP range.

For information about the techniques used for DPI classification, you might do well to look at two open source alternatives to the vendors mentioned: Snort and Suricata.
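
To show what "not encrypted" means in practice, here is a small sketch that reads the addresses and ports straight out of a raw IPv4/TCP packet. It assumes IPv4 carrying TCP with no link-layer header in front. `FlowKey`, `parse_ipv4_tcp`, and `SAMPLE_PACKET` are hypothetical names, and the sample packet is hand-built with the Facebook address quoted in the comments.

```python
import socket
import struct
from typing import NamedTuple


class FlowKey(NamedTuple):
    src: str
    dst: str
    sport: int
    dport: int


def parse_ipv4_tcp(packet: bytes) -> FlowKey:
    """Extract addresses and ports from a raw IPv4 packet carrying TCP."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    header_len = (packet[0] & 0x0F) * 4       # IHL is counted in 32-bit words
    if packet[9] != 6:                        # protocol number 6 = TCP
        raise ValueError("not TCP")
    if len(packet) < header_len + 4:
        raise ValueError("truncated TCP header")
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    sport, dport = struct.unpack_from("!HH", packet, header_len)
    return FlowKey(src, dst, sport, dport)


# Hand-built sample: 20-byte IPv4 header followed by a 20-byte TCP header.
SAMPLE_PACKET = (
    struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, 0, 0, 64, 6, 0,          # version/IHL ... TTL, proto, checksum
        socket.inet_aton("192.0.2.10"),
        socket.inet_aton("157.240.20.35"),
    )
    + struct.pack("!HH", 50000, 443)          # source and destination ports
    + bytes(16)                               # rest of the TCP header, zeroed
)

if __name__ == "__main__":
    print(parse_ipv4_tcp(SAMPLE_PACKET))
```

The destination address from such a key could then be passed to a reverse DNS lookup (for example `socket.gethostbyaddr`) or a WHOIS query, which is the resolution step the answer refers to.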

jas-
-1

Traffic identification is generally done with DPI, which basically checks specific parts of the packets to determine an application or a type of traffic. For example, HTTP traffic from Facebook could be detected with:

^(GET|POST).*Host:.*facebook\.com

And in some cases by checking the content-type you can guess if the traffic is video/audio or others.
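
A minimal sketch of this kind of signature for plain (unencrypted) HTTP follows. The pattern is a tightened variant of the one above, not a vendor signature: it anchors the domain to the end of the Host header so that look-alike hosts do not match. `is_facebook_http` and `media_kind` are hypothetical names.

```python
import re

# Request line, then a Host header ending in facebook.com (optional port).
FACEBOOK_HTTP = re.compile(
    rb"^(GET|POST) .*?\r\nHost:[ \t]*(?:[\w.-]+\.)?facebook\.com(?::\d+)?[ \t]*\r\n",
    re.DOTALL | re.IGNORECASE,
)

# Major type of the Content-Type header, e.g. "video" in "video/mp4".
CONTENT_TYPE = re.compile(
    rb"^Content-Type:[ \t]*([\w-]+)/", re.MULTILINE | re.IGNORECASE
)


def is_facebook_http(request_head: bytes) -> bool:
    """True if a clear-text HTTP request targets facebook.com or a subdomain."""
    return FACEBOOK_HTTP.match(request_head) is not None


def media_kind(response_head: bytes) -> str:
    """Guess the kind of payload from the response's Content-Type header."""
    m = CONTENT_TYPE.search(response_head)
    if m is None:
        return "unknown"
    major = m.group(1).decode("ascii").lower()
    return major if major in ("video", "audio", "image") else "other"


if __name__ == "__main__":
    req = b"GET /video HTTP/1.1\r\nHost: www.facebook.com\r\nAccept: */*\r\n\r\n"
    resp = b"HTTP/1.1 200 OK\r\nContent-Type: video/mp4\r\n\r\n"
    print(is_facebook_http(req), media_kind(resp))
```

None of this applies once the session is wrapped in TLS, since the request line and headers are then part of the encrypted payload.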

In the case of SSL traffic, the NGFW can only analyze the client/server hello and certificate messages and extract the domain name and other information; it is impossible with DPI to see what is inside. However, by using traffic metrics you can make a guess. For example, a large SSL download from a site could be a video, but it could equally be a file download, so this is only a guess.
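
The "traffic metrics" idea can be sketched as a simple heuristic over per-flow counters. The thresholds below are arbitrary assumptions chosen for illustration, and `FlowStats` and `guess_flow_kind` are hypothetical names. Real products presumably use far richer statistical models, which this does not attempt to reproduce.

```python
from dataclasses import dataclass

# Arbitrary illustrative thresholds, not taken from any product.
BULK_BYTES = 5_000_000        # at least ~5 MB received
BULK_DOWN_UP_RATIO = 20.0     # heavily download-dominated flow


@dataclass
class FlowStats:
    bytes_down: int           # server -> client payload bytes
    bytes_up: int             # client -> server payload bytes
    duration_s: float         # flow lifetime in seconds


def guess_flow_kind(flow: FlowStats) -> str:
    """Guess a coarse traffic type from volume and direction alone."""
    if flow.duration_s <= 0:
        return "unknown"
    ratio = flow.bytes_down / max(flow.bytes_up, 1)
    if flow.bytes_down >= BULK_BYTES and ratio >= BULK_DOWN_UP_RATIO:
        # Volume alone cannot tell a video stream from a file download.
        return "bulk-download-or-video"
    return "unknown"


if __name__ == "__main__":
    print(guess_flow_kind(FlowStats(50_000_000, 400_000, 120.0)))
    print(guess_flow_kind(FlowStats(20_000, 15_000, 300.0)))
```

The single "bulk-download-or-video" label is deliberate: as the answer says, these metrics only support a guess, so a valid application with a similar profile would be classified the same way.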

camp0