15

Running tcpdump on local connections to an Apache server, I found TCP connections being established and immediately closed every 2 seconds. How do I find which process is responsible for these? netstat -ctp did not help: the connections were too short-lived, and the process identifier is not displayed for connections in TIME_WAIT.

They turned out to be haproxy probes, which I could verify with strace, but I still do not know any way to pinpoint haproxy in the first place.

pmezard

3 Answers

21

You can use the auditd framework for this kind of thing. It is not very "user friendly" or intuitive, so it requires a little digging around on your part.

First make sure you have auditd installed, running and that your kernel supports it.
On Ubuntu, for example, you can install it with apt-get install auditd.

Then you add a policy for audit to monitor all connect syscalls like this:

auditctl -a exit,always -F arch=b64 -S connect -k MYCONNECT

If you are using a 32-bit installation of Linux you have to change b64 to b32.

This command inserts a policy into the audit framework, and any connect() syscalls will now be logged to your audit log files (usually /var/log/audit/audit.log) for you to look at.
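Once the rule is active, the log can be summarized per executable instead of being read line by line. What follows is only a minimal sketch, not part of the original answer: the function name count_connect_callers is made up for illustration, and it assumes the raw (non-interpreted) log layout in which each SYSCALL record carries pid=, exe= and key= fields.

```shell
# Hypothetical helper: count logged connect() records per executable and PID.
# $1 = key that was passed to auditctl -k
# $2 = log file (the default path is an assumption; adjust for your distro)
count_connect_callers() {
    local key log
    key=$1
    log=${2:-/var/log/audit/audit.log}
    # Keep SYSCALL records with our key, reduce each to "exe pid", then count.
    grep 'type=SYSCALL' "$log" | grep -F "key=\"$key\"" |
        sed -E 's/.* pid=([0-9]+) .* exe="([^"]*)".*/\2 \1/' |
        sort | uniq -c | sort -rn
}

# Example (reading the log usually requires root):
#   count_connect_callers MYCONNECT
```

A process that connects every 2 seconds should float to the top of this list quickly, since the output is sorted by record count.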

For example, a connection with netcat to news.ycombinator.com port 80 will result in something like this:

type=SYSCALL msg=audit(1326872512.453:12752): arch=c000003e syscall=42 success=no exit=-115 a0=3 a1=24e8fa0 a2=10 a3=7fff07a44cd0 items=0 ppid=5675 pid=7270 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts4 ses=4294967295 comm="nc" exe="/bin/nc.openbsd" key="MYCONNECT"
type=SOCKADDR msg=audit(1326872512.453:12752): saddr=02000050AE84E16A0000000000000000

Here you can see that the /bin/nc.openbsd application initiated a connect() call. If you get lots of connect calls and only want to grep for a certain IP or port, you have to do some conversion. The SOCKADDR line contains a saddr argument. It begins with 0200 (the address family, AF_INET), followed by the port number in hexadecimal (0050, which is 80), and then the IP address in hex (AE84E16A), which is news.ycombinator.com's IP of 174.132.225.106.
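The conversion described above can also be scripted rather than done with an online converter. This is a small sketch under stated assumptions: decode_saddr is a made-up name, and it only handles IPv4 records whose saddr starts with 0200, as in the example.

```shell
# Hypothetical helper: decode an IPv4 saddr hex string from a SOCKADDR record.
# Layout assumed: 4 hex digits family (0200), 4 digits port, 8 digits address.
decode_saddr() {
    local saddr port_hex o1 o2 o3 o4
    saddr=$1
    case $saddr in
        0200*) ;;  # AF_INET
        *) echo "not an AF_INET saddr: $saddr" >&2; return 1 ;;
    esac
    port_hex=$(echo "$saddr" | cut -c5-8)   # port, network byte order
    o1=$(echo "$saddr" | cut -c9-10)        # the four address octets
    o2=$(echo "$saddr" | cut -c11-12)
    o3=$(echo "$saddr" | cut -c13-14)
    o4=$(echo "$saddr" | cut -c15-16)
    # printf converts the 0x-prefixed hex arguments to decimal.
    printf '%d.%d.%d.%d:%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4" "0x$port_hex"
}

decode_saddr 02000050AE84E16A0000000000000000   # prints 174.132.225.106:80
```

Going the other way (turning the IP and port you care about into hex) gives you a fixed string you can grep for in the raw log.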

The audit framework can generate a lot of logs, so remember to disable it when you've accomplished your mission. To disable the above policy, simply replace -a with -d as such:

auditctl -d exit,always -F arch=b64 -S connect -k MYCONNECT

Good documentation on the auditd framework:
http://doc.opensuse.org/products/draft/SLES/SLES-security_sd_draft/part.audit.html

Convert IP addresses to/from hex, dec, binary, etc. at:
http://www.kloth.net/services/iplocate.php

General hex/dec converter:
http://www.statman.info/conversions/hexadecimal.html

A Brief Introduction to auditd, from the IT Security Stack Exchange. http://security.blogoverflow.com/2013/01/a-brief-introduction-to-auditd/

Edit 1:
Another quick-and-dirty (Swedish: fulhack) way to do it is to run a fast loop that dumps the connection data, like this:

while true;do
  ss -ntap -o state established '( dport = :80 )'
  sleep 1
done

This loop uses the ss command (socket statistics) to dump the currently established connections to port 80, including the process that initiated each one. If it produces a lot of data, you can add | tee /tmp/output after done to both show the output on screen and write it to /tmp/output for later processing. If it doesn't catch the quick haproxy connections, try removing sleep 1, but be cautious about excessive logging if it's a heavily utilized machine. Modify as needed!
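If the loop produces too much output to eyeball, it can be reduced to a count per process. This is a rough sketch, not something from the original answer: summarize_ss_users is a made-up name, and it assumes the users:(("name",pid=N,fd=M)) format printed by recent iproute2 versions of ss -p (older versions print the fields differently).

```shell
# Hypothetical helper: read `ss -p` output on stdin and count how often
# each "name pid" pair appears, most frequent first.
summarize_ss_users() {
    # Pull out every ("name",pid=N fragment, then rewrite it as "name N".
    grep -o '("[^"]*",pid=[0-9]*' |
        sed -E 's/^\("([^"]*)",pid=([0-9]+)$/\1 \2/' |
        sort | uniq -c | sort -rn
}

# Example: poll without sleeping for about 10 seconds, then summarize.
# Run as root, otherwise ss cannot show other users' processes.
#   end=$(( $(date +%s) + 10 ))
#   while [ "$(date +%s)" -lt "$end" ]; do
#       ss -ntp state established '( dport = :80 )'
#   done | summarize_ss_users
```

Polling can still miss connections that open and close between two ss runs, so an empty summary does not prove that nothing connected.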

pcarvalho
Mattias Ahnberg
  • Thank you for the detailed response. I will take your word for the auditd solution as the host kernel does not support it and I do not have the time now to find one suitable for experimentation, but I will keep that in mind. As for the polling solution, I started doing something similar with lsof but stopped pretty quickly as it was not... satisfying. – pmezard Jan 22 '12 at 10:53
  • You can also use `ausearch -i` to have those `saddr` hex strings decoded automatically for you. – sch Sep 02 '14 at 11:37
  • ss is more satisfying than lsof because it's faster and it has good filtering rules - no need for grep. I can appreciate the problems with support: Systemtap is another tool that is superb but getting it to run on a production server can be ... not satisfying. – Max Murphy Nov 21 '15 at 19:31
3

Actually, a lot has changed since this question was asked. Most modern Linux systems have advanced tracing capabilities that can be used for this purpose.

For example, bcc (some distros call it bpfcc-tools) has a tcpconnect utility, which does exactly that. Here's a snippet from an official example:

TIME(s)  PID    COMM         IP SADDR            DADDR            DPORT
31.871   2482   local_agent  4  10.103.219.236   10.251.148.38    7001
31.874   2482   local_agent  4  10.103.219.236   10.101.3.132     7001
31.878   2482   local_agent  4  10.103.219.236   10.171.133.98    7101
90.917   2482   local_agent  4  10.103.219.236   10.251.148.38    7001
90.928   2482   local_agent  4  10.103.219.236   10.102.64.230    7001
90.938   2482   local_agent  4  10.103.219.236   10.115.167.169   7101
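Output in this shape is easy to aggregate when the probes are frequent. The following is only a sketch: summarize_tcpconnect is a made-up name, it assumes the column layout shown above (with or without the leading TIME(s) column), and it assumes command names without spaces.

```shell
# Hypothetical helper: read tcpconnect output on stdin and count
# connections per "PID COMM DPORT", most frequent first.
summarize_tcpconnect() {
    awk '
        $NF == "DPORT" { next }                    # skip the header line
        NF >= 6 {
            # Count from the right so the optional TIME(s) column does not matter.
            count[$(NF-5) " " $(NF-4) " " $NF]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -rn
}

# Example: save some tcpconnect output to a file, then run
#   summarize_tcpconnect < saved-output.txt
```

For the sample above this gives 4 connections to port 7001 and 2 to port 7101, all from PID 2482 (local_agent).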

Another possibility is the bpftrace utility, which has a similar tcpconnect tool.

Or you can even use plain ftrace (but in that case you will have to write a script that decodes the sockaddr struct, or decode it manually). For example:

# Enable probe
echo 'p:tcp/connect tcp_connect sock=+0(%di):x8[32] prog=$comm' > /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/tcp/connect/enable

# Wait until applications call connect(), then check the trace buffer
cat /sys/kernel/debug/tracing/trace # note: the sockaddr data is still encoded here
# Disable the probe when done
echo 0 > /sys/kernel/debug/tracing/events/tcp/connect/enable
echo '-:tcp/connect' >  /sys/kernel/debug/tracing/kprobe_events

Note: in some cases you might need to mount debugfs/tracefs.
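A quick way to check whether the tracing files are reachable before writing to them is sketched below. The helper name find_tracing_dir is made up; the two candidate paths are the usual tracefs locations (standalone, or under debugfs as used above), which is an assumption about your system.

```shell
# Hypothetical helper: print the first given directory that contains
# kprobe_events, i.e. looks like a mounted tracefs.
find_tracing_dir() {
    local dir
    for dir in "$@"; do
        if [ -e "$dir/kprobe_events" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Needs root to see inside these directories.
tracing_dir=$(find_tracing_dir /sys/kernel/tracing /sys/kernel/debug/tracing) ||
    echo "tracefs not found; try: mount -t tracefs nodev /sys/kernel/tracing" >&2
```

If a directory is found, the echo commands above can be pointed at "$tracing_dir" instead of the hard-coded debugfs path.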

1

You can also grep the huge logs you get from "ausearch -i" to see only those sockets that successfully connected to another host on the internet. I wrote a simplistic script to get each process and command that created a socket to connect to a host on the internet along with the connection address of that target host and the current time that the socket was "created". Here it is:

#!/bin/bash

# ausearch needs root to read the audit logs.
if [[ $EUID -ne 0 ]]; then
    echo "You must run this script as root!"
    exit 1
fi

# Start with an empty result file.
> proccessConnections.dat

# Run ausearch once and reuse its output for every lookup below.
auditLog=$(mktemp)
trap 'rm -f "$auditLog"' EXIT
ausearch -i > "$auditLog"

# Event identifiers "(time:serial)" of every record that carries a host: address.
connections=$(grep host: "$auditLog" | awk -F "msg=audit" '{print $2}' | awk -F ": saddr" '{print $1}')

connectionsNumber=$(echo "$connections" | wc -l)

echo "Number of connections: $connectionsNumber"

echo "$connections" > conTemp.dat

counter=1
while read -r connectInfo; do

    # All records belonging to one event share the same identifier.
    event=$(grep -F "$connectInfo" "$auditLog")
    success=$(echo "$event" | grep "type=SYSCALL" | grep success=yes)
    addressInfo=$(echo "$event" | grep type=SOCKADDR | awk -F ': ' '{print $2}')
    processInfo=$(echo "$event" | grep "type=SYSCALL" | awk -F 'comm=' '{print $2}' | awk -F 'key' '{print $1}')

    if [[ -n $success ]]; then
        status="(success)   "
    else
        status="(no success)"
    fi

    # Print the line and append it to the result file.
    echo "[$counter - $connectionsNumber] $status  comm=$processInfo - $addressInfo - $connectInfo" | tee -a proccessConnections.dat

    ((counter++))

done < conTemp.dat