The context is network based load balancers. It would be nice if someone could explain how TCP ACK storms are generated in real life and practical mitigation strategies for them.

Edit: Clarification on "mitigation strategies". The platform consists of nginx web servers/load balancers running on Linux. So would appreciate the relevant switch to toggle (if any) to thwart it.

DeepSpace101

1 Answer

When a client and a server connect to each other, they identify the connection by its source and destination IP addresses and its port numbers (on the client and on the server). There are two sequence numbers, one for each direction of the connection flow; their initial values are chosen randomly. Whenever one of the parties sends a packet, that packet may talk about both sequence numbers: a packet from the client may include information about the sequence number for the client-to-server flow (e.g. a "PUSH") but also information about the sequence number for the server-to-client flow (an "ACK").

If the attacker can make it so that the client and the server disagree by a wide margin on both sequence numbers, then an ACK storm may occur. Basically, the client sends a packet to the server with a server-to-client value that the server finds ludicrously off-the-mark. The server responds with a packet which says "dear client, you are sorely mistaken about the current server-to-client sequence number, pray be so kind as to amend your ways". However (that's the nifty point), that packet will also include an ACK from the server which documents the server's notion of the current client-to-server sequence number... that the client will find preposterously invalid. This will prompt the client to respond with a packet that says "dear server, you are sorely mistaken about the current client-to-server sequence number, pray be so kind as to amend your ways". That packet will also contain an ACK. And the loop goes on...

The storm rages until one of the ACK packets is lost, ending the ping-pong match. However, it will start again the next time the client or the server has anything to say, because they are still desynchronized.
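The ping-pong described above can be illustrated with a toy simulation. This is only a sketch under strong simplifying assumptions: real TCP checks sequence numbers against a receive window rather than for exact equality, and the names `Segment`, `Endpoint` and `exchange` are made up for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int  # sender's own next sequence number
    ack: int  # sender's notion of the peer's next sequence number


@dataclass
class Endpoint:
    snd_nxt: int  # next sequence number this side will send
    rcv_nxt: int  # next sequence number this side expects from the peer

    def current_ack(self) -> Segment:
        return Segment(seq=self.snd_nxt, ack=self.rcv_nxt)

    def receive(self, seg: Segment):
        """Return a corrective ACK if the segment disagrees with our state."""
        # Simplification: exact match instead of a real window check.
        if seg.seq != self.rcv_nxt or seg.ack != self.snd_nxt:
            return self.current_ack()
        return None


def exchange(client, server, loss_rate, rng, max_packets=10_000):
    """Count packets sent, starting with one packet from the client."""
    sender, receiver = client, server
    seg = client.current_ack()
    sent = 0
    while seg is not None and sent < max_packets:
        sent += 1
        if rng.random() < loss_rate:
            break  # a lost packet ends this round of the storm
        seg = receiver.receive(seg)
        sender, receiver = receiver, sender
    return sent


rng = random.Random(0)
in_sync = exchange(Endpoint(100, 500), Endpoint(500, 100), 0.0, rng)
storm = exchange(Endpoint(100, 500), Endpoint(900_000, 700_000), 0.0, rng,
                 max_packets=50)
print(in_sync, storm)
```

With no packet loss, the synchronized pair stops after a single packet, while the desynchronized pair keeps answering each other until the artificial `max_packets` cap is reached. Setting `loss_rate` above zero makes the storm end at the first lost packet, as described above.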

ACK storms thus rely on the ability to desynchronize the client and the server for both sequence numbers. This can be done with semi-MitM abilities: e.g., the attacker can observe the packets between client and server, and also inject some fake packets, but not block existing packets. This is the model of "the attacker is on the same LAN, and the switch is in fact a hub (or has been demoted to hub-like behaviour after having been spammed to death with random junk packets from the attacker)". When the attacker sees a connection from client to server, he immediately sends a fake RST packet (as if the client had closed the connection) then a new fake SYN purportedly originating from the client, with the same port numbers, but widely different sequence numbers.

See this presentation for some details.

Defence is relatively easy: either of the systems engaged in an ACK storm can detect that it receives and sends inordinate amounts of error packets for a given connection, and simply drop it. The ACK storm can keep going only as long as both systems are ready to respond with an ACK for an ACK; at the first lost packet, the storm ceases. If a storm is detected, then: 1. the system that detects the storm can stop it by refusing to participate in it any longer, and 2. the connection being unsalvageable, it can be unceremoniously dropped.
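The detection idea can be sketched as a per-connection counter over a sliding time window. This is an illustration only, not how nginx or the Linux kernel implement anything: the class name `StormDetector`, the threshold of 20 corrective ACKs and the one-second window are all hypothetical values chosen for the example.

```python
from collections import deque


class StormDetector:
    """Flag a connection that emits too many corrective ACKs too quickly."""

    def __init__(self, max_acks=20, window_s=1.0):
        self.max_acks = max_acks    # hypothetical threshold
        self.window_s = window_s    # hypothetical window length, in seconds
        self._events = {}           # connection 4-tuple -> recent timestamps

    def record_corrective_ack(self, conn, now):
        """Record one corrective ACK; return True if conn should be dropped."""
        q = self._events.setdefault(conn, deque())
        q.append(now)
        # Forget events that fell out of the sliding window.
        while q and now - q[0] > self.window_s:
            q.popleft()
        if len(q) > self.max_acks:
            del self._events[conn]  # stop participating: drop the connection
            return True
        return False


detector = StormDetector(max_acks=3, window_s=1.0)
conn = ("10.0.0.1", 40000, "10.0.0.2", 80)
verdicts = [detector.record_corrective_ack(conn, t) for t in (0.0, 0.1, 0.2, 0.3)]
print(verdicts)
```

The connection is keyed by its address and port 4-tuple, as in the description of how the endpoints identify a connection. An occasional corrective ACK never trips the detector because old timestamps expire from the window; only a sustained burst does.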

Tom Leek
  • 1
    Thanks. Is there a way to configure an nginx web server or the OS (linux) to step away from fanning that ACK storm? – DeepSpace101 Feb 03 '14 at 18:36