Count lines where word in 3rd or 4th column exceeds n characters in text file

0

I have a large text file that has 4 space-separated columns.

somelongword otherlongword abcde abc

I would like to count the number of lines where the word in either the 3rd or 4th column is more than n characters long. Eventually I will have several files to look through, and I would like to print out one total number for the matching lines across all the files.

My intuition is that I should use something like awk but I can't figure out the syntax to do what I want.

RationalHusky

Posted 2016-11-05T22:43:40.793

Reputation: 3

Answers

1

Your intuition is right. There's probably a much simpler way of doing it via sed/awk... but I decided it was time to brush up on my perl and hacked this piece together:

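#!/usr/bin/perl
use warnings;
use strict;

my $n = 5;          # length threshold: report words longer than this
my $linenum = 1;    # running line number across all input files

while (<>)
{
    # split ' ' ignores leading whitespace, so indented lines keep their column positions
    my @cols = split ' ';
    # Treat a missing 3rd or 4th column as an empty word to avoid "uninitialized" warnings
    my ($third, $fourth) = map { $_ // '' } @cols[2, 3];
    if ((length($third) > $n) || (length($fourth) > $n))
    {
        print "Line $linenum: $_";
    }
    $linenum++;
}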

It only prints the line number and content of each line that matches the criteria, but having it print what you want shouldn't take much of a rewrite.
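If all you need is the count, an awk version along the lines you suggested might look like the sketch below. The function name count_long is just something I made up for illustration, and I'm assuming the columns are separated by whitespace as in your sample line. Passing several files gives one total for all of them.

```shell
#!/bin/sh
# count_long is a hypothetical helper name, not something from the original post.
# Usage: count_long N FILE...
count_long() {
    n=$1
    shift
    # Count lines whose 3rd or 4th whitespace-separated field is longer than n.
    # "c + 0" makes awk print 0 instead of an empty line when nothing matches.
    awk -v n="$n" 'length($3) > n || length($4) > n { c++ } END { print c + 0 }' "$@"
}
```

Something like `count_long 5 file1.txt file2.txt` would then print a single number. Lines with fewer than four columns are treated as having empty words there, so they only count if the column that does exist is long enough.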

Jarmund

Posted 2016-11-05T22:43:40.793

Reputation: 5 155