Save HTTP body with netcat

As hinted above, wget (and lynx, and curl) can do a much better job than netcat, but if you insist on netcat, you can filter out the header with your favourite scripting language. The HTTP header is terminated by \r\n\r\n. On Unix-like systems (I guess that's where you are), where tools split lines on \n, this means "all the header lines, each ending in \r, plus a line containing only \r", so the job is not as tough as it seems at first glance.

Using gawk (yes, GNU awk: RT is a gawk extension and, AFAIK, is not known by other awk versions), this can be your command:

netcat ... | gawk 'NR==1,/^\r$/ {next} {printf "%s%s",$0,RT}' > something.out

If you wonder "why not just use print instead of this ugly method?", the answer is: we don't know whether the last record (what gawk thinks is a record) is terminated by a newline, and we have no clue whether that final newline is significant. The only safe choice is to write it only if it was there in the input. RT holds the text that actually terminated the record, so it is empty when the final newline was missing, and the output is exactly what was sent and nothing more.

Gombai Sándor

Posted 2016-03-16T11:44:12.117

Reputation: 3 325
