
I'm at a bit of a loss.

First, some context: I've got an AWS EC2 instance behind an NLB. The NLB is using an Elastic IP. The EC2 instance is running a DNS server listening on UDP and TCP port 53, and the NLB is set up for TCP and UDP port 53. The instance is in a Target Group and healthy in the eyes of the NLB (and serving requests as expected).

Problem I'm trying to solve: I want to ensure I drop all DNS queries for record type ANY (as well as a few other rules to rate limit and filter) so I've added the following iptables rules:

$ iptables -t raw -I PREROUTING -p udp --dport 53 -m string \
    --hex-string "|0000FF0001|" --algo bm --from 40 -j DROP

$ iptables -t raw -I PREROUTING -p tcp --dport 53 -m string \
    --hex-string "|0000FF0001|" --algo bm --from 52 -j DROP

$ iptables -t raw -I PREROUTING -p udp --dport 53 -m string \
    --hex-string "|0000FF0001|" --algo bm --from 40 -j LOG \
    --log-prefix "BLOCKED ANY: "

$ iptables -t raw -I PREROUTING -p tcp --dport 53 -m string \
    --hex-string "|0000FF0001|" --algo bm --from 52 -j LOG \
    --log-prefix "BLOCKED ANY: "
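For anyone wondering where the match string comes from, here is a rough sketch (not part of my actual setup) of how the question section of an ANY query is laid out on the wire. `encode_question` is a made-up helper name. Each label is length-prefixed, the name ends with the root label 00, then comes QTYPE 00FF (255 = ANY) and QCLASS 0001 (IN), which gives the 0000FF0001 tail. As far as I understand it, the string match counts offsets from the start of the IP header, so --from 40 assumes a 20-byte IPv4 header with no options, an 8-byte UDP header and the 12-byte DNS header.

```shell
#!/bin/sh
# Sketch: print the DNS question section for an ANY/IN query as hex.
# encode_question is a hypothetical helper, not part of dig or iptables.
# The body runs in a subshell so the IFS change does not leak out.
encode_question() (
    IFS=.
    out=""
    for label in $1; do
        # one length byte, then the label bytes themselves
        len=$(printf '%02x' "${#label}")
        hex=$(printf '%s' "$label" | od -An -tx1 | tr -d ' \n')
        out="$out$len$hex"
    done
    # root label (00) + QTYPE 255 = ANY (00ff) + QCLASS 1 = IN (0001)
    printf '%s0000ff0001\n' "$out"
)

encode_question some.domain
# prints 04736f6d6506646f6d61696e0000ff0001
```

Since --from is only a lower bound on where the search starts, the pattern just has to appear at or after that offset.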

Now for the problem...

If I try dig some.domain -t any @public.ip.of.instance my query is blocked and I see the log entry in /var/log/kern.log as expected.

If I try dig some.domain -t any @elastic.ip.on.nlb the request is not blocked and I get a response. No log entry in kern.log.

The weirdest part for me is that I tried taking the NLB out of the picture and assigned the same Elastic IP to the instance directly. Same result - the ANY query sent to the EIP is not dropped even with the above iptables rules in place. The same ANY query sent from another instance using the private IP instead of the EIP is dropped as expected.

I've tried the same rules in the nat (also using the PREROUTING chain) and filter (using the INPUT chain) tables. Am I missing something obvious in my iptables rules?

Any other ideas?

slm
seajoshc
    Try to capture the packets in both cases with `tcpdump` (cli) or `wireshark` (gui) and compare them. Does the DNS payload differ? Do any offsets perhaps differ? BTW you can capture the DNS traffic on the instance and save it to *pcap file* using `tcpdump -w dump.pcap -s 0 -nn ... port 53` and then analyse the pcap file in `wireshark` on your desktop. Let us know what you find :) – MLu Jan 31 '20 at 10:44
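To compare offsets between two captures, something like the following might help. It is only a sketch: `find_offset` is a made-up helper, and it assumes you have already turned the bytes of interest into one contiguous lowercase hex string (for example, by cleaning up the hex that `tcpdump -x` prints). It reports the byte offset of the first aligned occurrence of the pattern.

```shell
#!/bin/sh
# Sketch: find the byte offset of a hex pattern inside a hex-encoded packet.
# find_offset is a hypothetical helper; both arguments are lowercase hex
# strings with no spaces. Prints the offset, or returns 1 if not found.
find_offset() {
    hex=$1
    pat=$2
    off=0
    while [ "${#hex}" -ge "${#pat}" ]; do
        case $hex in
            "$pat"*) echo "$off"; return 0 ;;
        esac
        hex=${hex#??}        # drop one byte (two hex digits)
        off=$((off + 1))
    done
    return 1
}

# DNS message only: a 12-byte header followed by the question for some.domain
msg=abcd0100000100000000000004736f6d6506646f6d61696e0000ff0001
find_offset "$msg" 0000ff0001
# prints 24
```

Here the offset is relative to the start of the DNS message. If the hex string starts at the IP header instead, the reported offset is directly comparable to the `--from` value in the rules.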

2 Answers


Looking around ServerFault I found this answer, "iptables drop packet by hex string match", which shows spaces between the hex values. I'd suggest trying that:

Example from that question:

$ iptables --append INPUT --match string --algo kmp \
    --hex-string '|f4 6d 04 25 b2 02 00 0a|' --jump ACCEPT

So change up your examples like so:

$ iptables -t raw -I PREROUTING -p udp --dport 53 -m string \
    --hex-string "|00 00 FF 00 01|" --algo bm --from 40 -j DROP
slm

Ok well after many hours of troubleshooting it looks like the short answer is that it was working all along...

I ran tcpdump on the EC2 instance behind the NLB (tcpdump udp port 53 -X -nn). Then from my Macbook (Catalina 10.15.2) I ran dig some.domain -t any @elastic.ip.on.nlb, and not only did I get a response, but the query never even showed up in the packet capture on the EC2 instance. There is only one EC2 instance behind the NLB, which uses the Elastic IP I put in the dig query. Being thoroughly weirded out, I then ran the same dig command on an Ubuntu machine and also on a Windows 10 computer. Both of those queries timed out (filtered properly by iptables), I saw them in the tcpdump, and the log message was in /var/log/kern.log as expected. I went back to my Macbook, ran the same dig command, and it still returned an answer with nothing in tcpdump... wtf!

I rebooted my Macbook, checked for the millionth time that I was using the right IPs and the same query, tried different domains, and probably a hundred other things. I am at a total loss as to why a query from my Macbook gets a response while nothing shows up in the packet capture.

So ultimately it seems like a strangely isolated issue (maybe some Apple lameness going on...) and not some weird packet mangling being done by AWS or a busted iptables rule like I initially thought. So the real answer is: try it out on multiple machines before posting to StackExchange.

EDIT: To clarify, I do see the query from my Macbook in tcpdump (and it times out as expected) if I use the public IP of the instance and not the EIP. It's only with the EIP that I do not see the query and it returns a response...

Also, I'm not sure if this should be an answer post or if I should have just amended my initial post. Mods do with it what you will!

seajoshc