Using netcat to log traffic to a file

I currently have a Windows 10 laptop connected to an Ubuntu 12.04 laptop via Ethernet. All of the traffic from the Windows system is routed through the Linux laptop and out the Linux system's wireless interface to the WiFi. I have been tasked with setting up the Linux system to log the traffic. My first thought was to use Wireshark. However, I was asked to have the logs in a specific format. This is the sample format I was given:

GET /
Host: bing.com
Cookie: MUID=0B62F80D880C681C2DB4F14E8C0C6BC5

HTTP/1.0 301 Moved Permanently
Date: Wed, 29 Jun 2016 19:34:28 GMT
Location: http://www.bing.com/
----------------------------------------------------------

I was also initially asked if netcat would be appropriate, so afterwards I figured netcat was the way to go. However, I have never used netcat for logging purposes. I have looked up other methods and have tried approaches involving pipes and FIFOs (neither of which I have any knowledge of). I always end up with the same result: a blank output file. Any help would be appreciated. Thanks.

CorruptedOffset

Posted 2016-07-11T18:26:57.933

Reputation: 13

Answers

The "logs" you show are the HTTP request and response. I would not use nc as that seems to be just adding in more complexity.

I would recommend using tcpdump. It captures basically the same data as Wireshark would, and you can even parse the dumps with Wireshark. When I'm trying to gather HTTP requests and responses, I like to use the -A option, which prints each packet in ASCII.

This seems to work for me to show these headers (only for plain HTTP; traffic on port 443 is encrypted, so its headers will not be readable):

tcpdump -A -s 8000 '(port 80 or port 443) and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' | awk --re-interval 'match($0,/^.{8}((GET |HTTP\/|POST |HEAD ).*)/,a){print "\n"a[1];h=2}$0~/^.{0,1}$/&&h{h--}h&&/^[A-Za-z0-9-]+: /{print}' >> your_log_file

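The above command should be run as root. If you are running it as another user, or if you only want to capture on a single interface, be sure to add -i <your_device> to tcpdump. Note that the awk part uses gawk extensions (--re-interval and the three-argument match()), so it needs gawk rather than mawk. For instance: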

tcpdump -i eth0 -A -s 8000 '(port 80 or port 443) and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' | awk --re-interval 'match($0,/^.{8}((GET |HTTP\/|POST |HEAD ).*)/,a){print "\n"a[1];h=2}$0~/^.{0,1}$/&&h{h--}h&&/^[A-Za-z0-9-]+: /{print}' >> your_log_file
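
The bracketed arithmetic in the capture filter keeps only packets that carry TCP payload: it takes the IP total length (ip[2:2]) and subtracts the IP header length ((ip[0]&0xf)<<2) and the TCP header length ((tcp[12]&0xf0)>>2). As a rough illustration, the same calculation can be done in shell arithmetic. The function name payload_len and the byte values passed to it are made up for this example and are not taken from a real capture.

```shell
# Sketch: reproduce the payload-length arithmetic from the tcpdump filter.
# payload_len is a hypothetical helper; its arguments stand in for
# ip[2:2], ip[0] and tcp[12] of a single packet.
payload_len() {
    ip_total=$1      # ip[2:2]: total length of the IP packet in bytes
    ip_byte0=$2      # ip[0]:   version (high nibble), header length in words (low nibble)
    tcp_byte12=$3    # tcp[12]: TCP data offset in words (high nibble)
    ip_hdr=$(( ($ip_byte0 & 0xf) << 2 ))       # words * 4 = IP header bytes
    tcp_hdr=$(( ($tcp_byte12 & 0xf0) >> 2 ))   # (words << 4) >> 2 = TCP header bytes
    echo $(( ip_total - ip_hdr - tcp_hdr ))
}

payload_len 52 0x45 0x80   # bare ACK with TCP options: prints 0, filtered out
payload_len 60 0x45 0x80   # same headers plus 8 bytes of data: prints 8, kept
```

Packets for which this value is 0 (handshakes and bare ACKs) never reach awk, which keeps the pipeline from processing empty segments.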

DKing

Posted 2016-07-11T18:26:57.933

Reputation: 250

So I copied and pasted your code exactly. Bear in mind that I didn't really understand any part of the commands, besides the fact that the first word was the command and the rest were parameters and options. I got a response that said: tcpdump: awk: line 1: no suitable device found syntax error at or near , – CorruptedOffset – 2016-07-11T19:21:23.777

Try running this as root so that it can listen to all devices. Otherwise, add -i <your_device> to the tcpdump command to specify the network device. – DKing – 2016-07-11T19:28:14.907

I typed in tcpdump -i eth0 and started seeing that the traffic was being captured. How would I go about exporting this to a log file, and would it be in the same format? P.S. I was told to make the log file look exactly like the sample format I posted. – CorruptedOffset – 2016-07-11T19:35:23.167

Just use >> your_filename. I'll edit the answer with that in it. – DKing – 2016-07-11T19:40:53.423

Thank you very much for the help. You have helped me a lot. Just one last question: any idea how to get the format? – CorruptedOffset – 2016-07-11T19:55:11.623

I'm sure that I could try to help. I'm not certain what you are asking for, though. The data you requested is in the HTTP request and response headers. The first line of each is the important part, and the other lines give more information. The command listens for the raw packets (over the HTTP and HTTPS ports) and strips out the plain text data being sent. What type of formatting are you looking for? – DKing – 2016-07-11T20:20:41.697

Essentially the exact same format as the one in the original post. My guess is that this was the original output from the software used when I was given the task. If it is not possible to output in that exact format, do you know of a way to parse it? – CorruptedOffset – 2016-07-11T20:28:51.453

I'm not certain what the difference would be. I'm seeing the HTTP headers and responses. I'm missing the "-"s at the end, but I'm not sure how helpful that would be, as the requests and responses are not likely to be in order. As for the header fields, for the most part, none are required, and they may come in almost any order. If you only want certain headers or if you want them in a specific order, that would take more work. Unless I'm missing something, anything more than this would be up to a developer, and would be too unique for the purpose of this site. – DKing – 2016-07-11T20:48:46.133

That is what I was tasked with. I was never given a reason for it to be in that particular format; however, it was made very clear that it should be in that explicit order. You are correct, though, that the information is there. – CorruptedOffset – 2016-07-11T20:54:12.613
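
For the missing "-"s mentioned in the comments, one possible approach is a small post-processing filter over the header log. The sketch below assumes the log layout produced by the tcpdump/awk pipeline in the answer (blocks separated by blank lines, responses starting with "HTTP/"). The name add_separators is hypothetical.

```shell
# Sketch: append the dashed separator after every response block.
# add_separators is a hypothetical helper name; it reads the header log
# produced by the tcpdump/awk pipeline on standard input.
add_separators() {
    awk '
        BEGIN { sep = "----------------------------------------------------------" }
        /^HTTP\// { in_resp = 1 }                     # a status line starts a response block
        /^$/ && in_resp { print sep; in_resp = 0 }    # a blank line ends that block
        { print }
        END { if (in_resp) print sep }                # close a response at end of input
    '
}
```

It could be run over an existing log, for example add_separators < your_log_file > formatted_log. It only marks where each response ends. Because the log carries no connection information, it cannot guarantee that a response sits directly under the request that caused it.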

ngrep provides the fields you are looking for, but you'll need a script to ensure the output is in the correct order (assuming there will be overlapping requests/responses).

http://ngrep.sourceforge.net/usage.html

Example:

[user@host ~]$ ngrep -W byline -q 'HTTP'

T 172.999.999.999:65535 -> 198.41.209.136:80 [AP]
GET / HTTP/1.1.
User-Agent: Wget/1.17.1 (linux-gnu).
Accept: */*.
Accept-Encoding: identity.
Host: reddit.com.
Connection: Keep-Alive.
.


T 198.41.209.136:80 -> 172.999.999.999:65535 [AP]
HTTP/1.1 301 Moved Permanently.
Date: Sat, 13 Aug 2016 13:55:26 GMT.
Transfer-Encoding: chunked.
Connection: keep-alive.
Set-Cookie: __cfduid=9999999999999999999999999999999999999999; expires=Sun, 13-Aug-17 13:55:26 GMT; path=/; domain=.reddit.com; HttpOnly.
Location: https://www.reddit.com/.
X-Content-Type-Options: nosniff.
Server: cloudflare-nginx.
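
The kind of ordering script mentioned above could look roughly like the following. This is only a sketch under several assumptions: the input has the -W byline layout shown in the example, the trailing "." on each line is the carriage return rendered as a dot, the headers of each message fit in its first packet, and each connection has at most one outstanding request. The name pair_ngrep is hypothetical.

```shell
# Sketch: pair ngrep "-W byline" requests with their responses.
# pair_ngrep is a hypothetical helper; it assumes the output layout shown
# in the example above and at most one outstanding request per connection.
pair_ngrep() {
    awk '
        # Store a finished request, or print a finished response together
        # with the request that was waiting on the same connection.
        function finish() {
            if (kind == "req") {
                pending[key] = text
            } else if (kind == "resp") {
                if (key in pending) {
                    printf "%s\n", pending[key]
                    delete pending[key]
                }
                printf "%s%s\n", text, sep
            }
            kind = ""; text = ""
        }
        BEGIN { sep = "----------------------------------------------------------" }
        # Packet header line, e.g. "T client:port -> server:port [AP]"
        /^T [0-9]/ {
            finish()
            src = $2; dst = $4; state = "start"
            next
        }
        state == "start" || state == "headers" {
            sub(/\.$/, "")                        # drop the dot printed for the carriage return
            if (state == "start") {
                if ($0 ~ /^HTTP\//) {
                    kind = "resp"; key = dst " " src    # key is always "client server"
                } else if ($0 ~ /^[A-Z]+ [^ ]+ HTTP\//) {
                    kind = "req"; key = src " " dst
                } else {
                    state = "skip"; next          # continuation packet, not a new message
                }
                state = "headers"; text = $0 "\n"
            } else if ($0 == "") {
                state = "skip"                    # blank line: end of the header block
            } else {
                text = text $0 "\n"
            }
        }
        END { finish() }
    '
}
```

The ngrep output would be piped through it, for example ngrep -W byline -q 'HTTP' | pair_ngrep >> your_log_file. Each response is printed directly under the request from the same client and server address pair, followed by the dashed line, even when requests on different connections overlap.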

Jeff S.

Posted 2016-07-11T18:26:57.933

Reputation: 126