Extract IP addresses in each line of file in shell

-1

1

I have this input file on Ubuntu:

146.14.142.96.17747 197.102.40.184.13748:
146.14.142.96.17747 197.102.40.184.13749: 
146.14.142.96.17747 197.102.40.184.13750:
146.114.142.96.17747 197.102.40.184.13751:
46.14.142.96.17747 197.102.40.184.13752:

I'd like to produce the output below using shell scripting. That is, keep the two IP addresses on each line and delete the port numbers:

146.14.142.96 197.102.40.184
146.14.142.96 197.102.40.184 
146.14.142.96 197.102.40.184
146.114.142.96 197.102.40.184
46.14.142.96 197.102.40.184

Arash

Posted 2013-04-25T16:49:26.723

Reputation: 678

On Linux, use the "sed" tool, possibly together with grep.

– Diogo – 2013-04-25T16:55:06.010

@Diogo Thanks. I'm not an expert in regular expressions. – Arash – 2013-04-25T16:56:42.833

Answers

3

For lines formatted exactly as shown in the question, this will do:

sed -E 's/\.[0-9]+[ :]/ /g' input-file

How it works:

  • The -E switch enables Extended Regular Expressions.

  • s/SEARCH/REPLACE/g globally (/g) replaces (s/) SEARCH with REPLACE.

  • \.[0-9]+[ :] matches a dot followed by one or more digits followed by a space or a colon.
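As a quick check of the command above, here is a minimal sketch that runs it on two of the sample lines from the question. The temporary file and the variable names input_file and first_result are only illustrative assumptions, not part of the original answer.

```shell
# Recreate two sample lines from the question in a temporary file
# (input_file is a hypothetical name used only for this sketch).
input_file=$(mktemp)
printf '%s\n' \
  '146.14.142.96.17747 197.102.40.184.13748:' \
  '46.14.142.96.17747 197.102.40.184.13752:' > "$input_file"

# Replace every ".<digits>" that is directly followed by a space or a colon
# with a single space. The dots inside the IPs are followed by another dot
# after their digits, so they are left alone.
first_result=$(sed -E 's/\.[0-9]+[ :]/ /g' "$input_file")
printf '%s\n' "$first_result"

rm -f "$input_file"
```

Note that the trailing ".13748:" is also replaced by a space, so each output line ends with a trailing space.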

However, this will break if the formatting varies even slightly. The following approach may be more robust:

sed -E 's/(([0-9]+\.){3}[0-9]+)[^ ]+/\1/g' input-file

How it works:

  • ([0-9]+\.){3}[0-9]+ matches an IP (three groups of digits, each followed by a dot, plus a final group of digits).

  • The surrounding parentheses declare the previous match as the first submatch (\1).

  • [^ ]+ matches one or more non-space characters that follow the IP.
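To illustrate how the second command tolerates a different port format, here is a minimal sketch. The first sample line is from the question; the second line (with ":8080" and ",443;") is a made-up variation, and the names varied_file and second_result are assumptions for this example only.

```shell
# One line in the question's format and one hypothetical line that uses
# other separators between IP and port.
varied_file=$(mktemp)
printf '%s\n' \
  '146.14.142.96.17747 197.102.40.184.13748:' \
  '172.30.79.1:8080 247.236.141.5,443;' > "$varied_file"

# Capture the four dotted digit groups as \1, then swallow every following
# non-space character and keep only the captured IP.
second_result=$(sed -E 's/(([0-9]+\.){3}[0-9]+)[^ ]+/\1/g' "$varied_file")
printf '%s\n' "$second_result"

rm -f "$varied_file"
```

This still assumes that each IP is followed by at least one extra non-space character (the port) and that a space separates the two entries on a line.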

Dennis

Posted 2013-04-25T16:49:26.723

Reputation: 42 934

This is not working: for some IPs it cuts the IP. – Arash – 2013-04-25T17:16:02.537

181.173.82.61 250.66.33.195 181.173.82.60 250.66.33.195 181.173.82 229.96.193 181.173.83 245.228.178 181.173.82.61 250.66.33.195 181.173.82.60 250.66.33.195 172.30.79 247.236.141 – Arash – 2013-04-25T17:16:44.667

172.30.79 247.236.141 this is not valid ip – Arash – 2013-04-25T17:17:17.700

Look up the line containing the IP starting with 172.30.79. I assume it had a different format. – Dennis – 2013-04-25T17:20:46.117

One minute please, I'm checking it now. – Arash – 2013-04-25T17:21:37.560

yeap you right! apologies :) – Arash – 2013-04-25T17:25:32.640

I've added a second approach that should depend a little less on the formatting (works as long as there's a space before the second IP). – Dennis – 2013-04-25T17:29:08.197

Could you please tell me what / /g is doing? What is it replacing with? – Arash – 2013-04-25T17:42:50.150

/ / replaces the first match with a space. / /g replaces all matches with a space. – Dennis – 2013-04-25T17:44:12.950

One more question: how can I use grep to find lines that contain HTTP/1.0 200 OK, HTTP/1.1 OK, and so on? – Arash – 2013-04-27T18:54:55.390

Is it intentional that HTTP/1.1 OK doesn't contain the status code? If it isn't, grep 'HTTP/1.[01] 200 OK' will do. – Dennis – 2013-04-28T00:12:32.480

0

Do a search and replace using regex

(\d+\.\d+\.\d+\.\d+)\.\d+(:?)

and replace text

\1

Various tools support regex search and replace though the dialect can be slightly different. The above works with Notepad++.

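Or in vim you can do the following (the % range applies the substitution to every line of the file, not just the current one):

:%s/\(\d\+\.\d\+\.\d\+\.\d\+\)\.\d\+\(:\?\)/\1/g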
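Since the question asks for a shell solution, the same pattern can also be sketched for sed. POSIX extended regular expressions have no \d shorthand, so this version assumes [0-9] as the equivalent; the function name strip_ports is hypothetical and used only for illustration.

```shell
# Hypothetical helper: reads lines on stdin and removes the fifth dotted
# group (the port) plus an optional trailing colon after each IP.
strip_ports() {
  sed -E 's/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\.[0-9]+:?/\1/g'
}

# Usage with one of the sample lines from the question.
printf '%s\n' '146.114.142.96.17747 197.102.40.184.13751:' | strip_ports
```

Because the colon is matched as part of the pattern and dropped, the output has no trailing separator.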

jizugu

Posted 2013-04-25T16:49:26.723

Reputation: 81

this is not working grep '(\d+.\d+.\d+.\d+).\d+(:?)' – Arash – 2013-04-25T16:57:53.320

For one IP I use cut -d "." -f -4 and it works. – Arash – 2013-04-25T16:59:55.737

but for 2 IPs i dont know what to do :( – Arash – 2013-04-25T17:00:17.283