Script to reference another file first

grep is the command whose primary purpose is to look through text for a pattern (or patterns). Its simplest application to this task would be

grep 1.2.3.4 db

which will output the “IP_address Hostname” line(s) from the db file that match the pattern “1.2.3.4”. Note that I said line(s) (rather than line) and match the pattern (rather than contain the string), because there are a couple of gotchas here.

One is that grep, as I've said, (by default) searches for patterns rather than strings. These patterns are called regular expressions, and there's a world of documentation on them. The most important fact about regexes for this discussion is that a period, “.”, is a pattern that matches any single character. So, for example, the pattern (regex) 1.2.3.4 would match the strings 132.364 and 1a2b3c4. Such strings are unlikely to show up in the IP_address column, but they might appear in the Hostname column.
Another is that, unless otherwise specified, grep searches for the provided pattern anywhere on the line. So the pattern 1.2.3.4 would match lines containing 31.2.3.4 and 1.2.3.42.
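
Both gotchas are easy to reproduce. Here is a minimal sketch, assuming a throwaway db file; the addresses and hostnames in it are invented purely for illustration:

```shell
# Hypothetical sample data (not from the question).
db=$(mktemp)
printf '%s\n' \
    '1.2.3.4 alpha' \
    '31.2.3.4 bravo' \
    '1.2.3.42 charlie' \
    '10.0.0.1 host-1a2b3c4' \
    '10.0.0.2 delta' > "$db"

# The unanchored pattern matches every line except the last one:
# "." matches any character, and the match may start anywhere on the line.
loose_matches=$(grep 1.2.3.4 "$db")
printf '%s\n' "$loose_matches"

rm -f "$db"
```

Only the 10.0.0.2 line is left out; the host-1a2b3c4 line is reported because of its hostname, not its address.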

But I believe that you’ll do a good job of getting the line you want – and only that line – with the command

grep "^1.2.3.4[[:space:]]" db

where

“^” is another character that’s special in regular expressions; it means that this pattern must appear at the beginning of a line.
“[[:space:]]” represents a whitespace character; either a space or a tab. If you are sure that your db file uses only spaces, you can just search for a space; e.g.,
```
grep "^1.2.3.4 " db
```

These will still find lines that begin with 132.364 or 1a2b3c4. But, if your db file is properly structured, you should have no such strings at the beginnings of lines in your db file.
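
To see what the anchors buy you, here is a small sketch, again with an invented db file; note that its first line is deliberately tab-separated while the others use spaces:

```shell
# Hypothetical sample data: line 1 uses a tab, lines 2 and 3 use a space.
db=$(mktemp)
printf '1.2.3.4\talpha\n31.2.3.4 bravo\n1.2.3.42 charlie\n' > "$db"

# "^" rejects 31.2.3.4, and the trailing [[:space:]] rejects 1.2.3.42.
anchored_match=$(grep "^1.2.3.4[[:space:]]" "$db")
printf '%s\n' "$anchored_match"

# A literal space in the pattern does not match the tab-separated line.
# "|| true" keeps grep's "no match" exit status from aborting a strict script.
space_only_count=$(grep -c "^1.2.3.4 " "$db" || true)
echo "space-only matches: $space_only_count"

rm -f "$db"
```

The second command finds nothing here, which is the reason to prefer [[:space:]] unless you are certain about the separator.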

You can prevent “.” from matching arbitrary characters with any of the following:

grep "^1\.2\.3\.4[[:space:]]" db
grep "^1\.2\.3\.4 " db
grep -F "1.2.3.4 " db

but the first two will require you to jump through hoops in your script to convert 1.2.3.4 to 1\.2\.3\.4, and the third one takes away the special meaning of “^” (so 31.2.3.4 will still match).
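
If you do want the escaped form, the hoop-jumping can be fairly small. One possible sketch uses sed to put a backslash in front of every dot; the variable names and sample data are assumptions of mine, not anything from your script:

```shell
ip='1.2.3.4'   # hypothetical input value

# Put a backslash in front of every "." so grep treats it as a literal dot.
escaped_ip=$(printf '%s\n' "$ip" | sed 's/\./\\./g')
printf '%s\n' "$escaped_ip"

# Hypothetical sample data: the second line would match the unescaped pattern.
db=$(mktemp)
printf '%s\n' '1.2.3.4 alpha' '1a2b3c4 bogus' > "$db"

# With the dots escaped, only the real address matches.
strict_match=$(grep "^${escaped_ip}[[:space:]]" "$db")
printf '%s\n' "$strict_match"

rm -f "$db"
```

This escapes only dots, which is enough for IPv4 addresses; input containing other regex metacharacters would need more work.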

As stated above, these commands all output the entire line from the db file that matches the IP address. To get just the hostname (the second column), use

grep "^1.2.3.4[[:space:]]" db | awk '{print $2}'

This will just write the hostname to the standard output (typically, the screen). A more useful operation in a script would be

db_host=$(grep "^1.2.3.4[[:space:]]" db | awk '{print $2}')

which captures that output into the shell variable db_host.

So, finally, this section of your script might look something like

db_host=$(grep "^$this_IP_address[[:space:]]" db | awk '{print $2}')
if [ "$db_host" = "" ]
then
    Use the host command.
          ︙
fi

which you would put in a loop, wherein the variable this_IP_address takes on each IP address that you want to process. If you have them in a file (i.e., another file, separate from db), then read that file one line at a time.

For example, if you have IP addresses (one per line) in a file called src_ip, and all you want to do is write the hostnames to the standard output, you could say,

while read -r this_IP_address
do
    db_host=$(grep "^$this_IP_address[[:space:]]" db | awk '{print $2}')
    if [ "$db_host" = "" ]
    then
        # Use the host command to lookup $this_IP_address.
        host "$this_IP_address" | head -n 1 | awk '{print $5}'
    else
        echo "$db_host"
    fi
done < src_ip

grep -f src_ip db wouldn’t be particularly useful for the problem you’re describing (if I understand it correctly) because it would report all the lines in db that match any IP address in src_ip – leaving you to figure out which IP addresses weren’t matched.

A simpler way of doing this has occurred to me:

awk -v this_one="$this_IP_address" '$1 == this_one {print $2}' db

which copies shell variable this_IP_address into awk variable this_one and then says “print the second field of any line whose first field is equal to this_one.” This is one process, as opposed to two, and it eliminates the issue of “.” matching arbitrary characters. Wrap it in db_host=$(…) as before.
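
Put together with the loop, the awk version might look like the sketch below. The sample files and the helper name lookup_db are hypothetical, and the host call is replaced by a message so that the example runs without network access:

```shell
# Hypothetical sample files; the contents are invented for illustration.
db=$(mktemp)
src_ip=$(mktemp)
printf '%s\n' '1.2.3.4 alpha' '31.2.3.4 bravo' > "$db"
printf '%s\n' '31.2.3.4' '1.2.3.4' '1a2b3c4' > "$src_ip"

# lookup_db ADDRESS FILE: print the second field of every line in FILE
# whose first field is exactly ADDRESS (hypothetical helper name).
lookup_db() {
    awk -v this_one="$1" '$1 == this_one {print $2}' "$2"
}

results=$(
    while read -r this_IP_address
    do
        db_host=$(lookup_db "$this_IP_address" "$db")
        if [ -z "$db_host" ]
        then
            # In the real script, this is where the host command would run.
            echo "$this_IP_address: not in db"
        else
            echo "$db_host"
        fi
    done < "$src_ip"
)
printf '%s\n' "$results"

rm -f "$db" "$src_ip"
```

Because awk compares the first field as a whole string, 1a2b3c4 falls through to the "not in db" branch without any escaping.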

G-Man Says 'Reinstate Monica'

Posted 2014-09-18T23:27:10.497

Reputation: 6 509

This is not what I'm looking for. The file needs to reference my 2nd column of the db file. That second column is the hostname. A file of just one column (ip addresses) should reference a file called db which has 2 columns, ip hostname. If there is no match, then each ip in my source file should return the hostname using the host command – unixpipe – 2014-09-19T15:52:08.100

How do I make a variable like $this_IP_address to include every line of a file (in this case, every IP?). Also how would I run the host command? What is the exact command there? since it would need to call a variable to do that. And if I'm using a file, shouldn't I be using grep -f src_ip db | awk '{print $2}') ?? – unixpipe – 2014-09-19T17:59:48.993

@unixpipe: I updated the answer again. “How would I run the host command?” Huh? I thought you were already using it – your question says, “I am already doing the latter with host …” Why are you asking how to use it now? – G-Man Says 'Reinstate Monica' – 2014-09-19T18:55:21.913

You misunderstood. In your little script above, how would you make all the ip's in $this_IP_address resolve with the host command? – unixpipe – 2014-09-19T21:00:54.793

I know how to run the host command, but in your script, you need to run it on a variable. Which variable are you using? – unixpipe – 2014-09-19T21:39:29.393

Ok the second script seemed to work but it only outputted what was not found in the db. I just used host $this_IP_address on the section that you were not explicit with the host command. What I need as the final output is basically the ones that are found in the db to be outputted to the screen as well. – unixpipe – 2014-09-19T22:37:48.563
