awk search in specific columns

We have a file like this (with many more lines):

BeginJobID=S0065546 JESMSGLG(1/281) jname=CICWCMWD queue=EXECUTION JESMSGLG(2/281) BeginJobID=S0065568 jname=CICWWUWD queue=EXECUTION JESMSGLG(3/281) jname=CICWMCWD BeginJobID=S0065569 queue=EXECUTION JESMSGLG(4/281) jname=CICWTQ11 queue=EXECUTION BeginJobID=S0065599 BeginJobID=S0065600 JESMSGLG(5/281) queue=EXECUTION jname=CICWFA11 JESMSGLG(6/281) jname=CICWFA21 BeginJobID=S0065601 queue=EXECUTION JESMSGLG(7/281) jname=CICWFY11 BeginJobID=S0065602 queue=EXECUTION BeginJobID=S0065603 JESMSGLG(8/281) jname=CICWFY21 queue=EXECUTION BeginJobID=S0065604 JESMSGLG(9/281) jname=CICWFQ11 queue=EXECUTION BeginJobID=S0065605 JESMSGLG(10/281) queue=EXECUTION jname=CICWFT11 JESMSGLG(11/281) jname=CICWFT21 queue=EXECUTION BeginJobID=S0065606 JESMSGLG(12/281) jname=CICWFT31 queue=EXECUTION BeginJobID=S0065607 JESMSGLG(13/281) jname=CICWFT41 queue=EXECUTION BeginJobID=S0065608 BeginJobID=S0065609 JESMSGLG(14/281) jname=CICWGA11 queue=EXECUTION BeginJobID=S0065612 JESMSGLG(15/281) jname=CICWGA21 queue=EXECUTION JESMSGLG(16/281) BeginJobID=S0065613 jname=CICWGQ11 queue=EXECUTION BeginJobID=S0065614 JESMSGLG(17/281) queue=EXECUTION jname=CICWGY11 BeginJobID=S0065615 JESMSGLG(18/281) jname=CICWGT21 queue=EXECUTION BeginJobID=S0065616 JESMSGLG(19/281) jname=CICWTT41 queue=EXECUTION JESMSGLG(20/281) BeginJobID=S0065617 jname=CICWGT11 queue=EXECUTION

I would like a simple awk command that produces two reports like these:

executing: awk_simple_command_(jname=) result:

CICWCMWD
CICWWUWD
CICWMCWD
CICWTQ11
CICWFA11
CICWFA21
CICWFY11
CICWFY21
CICWFQ11
CICWFT11
CICWFT21
CICWFT31
CICWFT41
CICWGA11
CICWGA21
CICWGQ11
CICWGY11
CICWGT21
CICWTT41
CICWGT11

executing: awk_simple_command_(BeginJobID=) result:

Alex Bermudez

Posted 2012-10-01T18:52:07.750

Reputation: 21

Does your data file contain hard line breaks, or are your columns delimited only by a space? – Kenneth Murphy – 2012-10-01T20:03:21.947

Answers

If your input data file contains columns delimited only by spaces, with no newlines, here's one way to solve the problem using awk:

reports.awk

BEGIN {
  # Split records on the space character
  RS=" ";
  # Within each record, split the components (fields) on the '=' character
  FS="=";
}
# When the first field is the one requested (colname),
# print the second field.
$1 == colname { print $2; }

Then, assuming your data file is named "data", you can invoke the program like so:

$ awk -f reports.awk colname=jname data

Use either colname=jname or colname=BeginJobID, depending on which data you want to extract. This should produce the output you want.

If your data file sometimes uses a newline in lieu of a space, you'll want to convert those to spaces first and pipe the result into awk:

$ cat data | tr "\n" " " | awk -f reports.awk colname=BeginJobID -

And you can certainly stick that command in a shell script if you'll be using it often.
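If you would rather not pre-process the file with tr, a possible variant is to leave awk's default whitespace splitting in place and loop over the fields of each line, so that spaces and newlines are both treated as separators. This is only a sketch: the function name extract_field and the small sample file are made up for illustration, and it assumes every token of interest has the form key=value with no embedded whitespace.

```shell
#!/bin/sh
# extract_field KEY FILE (hypothetical helper, not part of the original answer)
# Prints the value of every KEY=value token in FILE, one per line.
extract_field() {
  awk -v key="$1" '
    {
      # Default FS: fields are separated by runs of blanks;
      # each input line is its own record.
      for (i = 1; i <= NF; i++) {
        eq = index($i, "=")            # position of the first "=" in the token
        if (eq > 0 && substr($i, 1, eq - 1) == key) {
          print substr($i, eq + 1)     # everything after the first "="
        }
      }
    }
  ' "$2"
}

# Illustrative sample: two lines cut from the data shown in the question.
sample=$(mktemp)
printf '%s\n' \
  'BeginJobID=S0065546 JESMSGLG(1/281) jname=CICWCMWD queue=EXECUTION' \
  'JESMSGLG(2/281) BeginJobID=S0065568 jname=CICWWUWD queue=EXECUTION' > "$sample"

extract_field jname "$sample"
extract_field BeginJobID "$sample"
rm -f "$sample"
```

Because the key is compared as a plain string against the text before the first "=", a token such as xjname=... would not be picked up when you ask for jname.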

Kenneth Murphy

Posted 2012-10-01T18:52:07.750

Reputation: 343

very nice solution. – Alex Bermudez – 2012-10-01T23:11:24.277

You might also want to consider using grep:

grep -o 'jname=[^ ]\+' infile | grep -o '[^=]\+$'

And:

grep -o 'BeginJobID=[^ ]\+' infile | grep -o '[^=]\+$'
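The two grep pipelines differ only in the key, so they could be folded into one small shell function. The following is a sketch under a few assumptions: grep_field is a made-up name, your grep supports -o (GNU and BSD grep do, but it is not required by POSIX), and the key contains no regex metacharacters. Note that grep is case-sensitive, so the key must be spelled exactly as in the data (BeginJobID, not BeginJobId).

```shell
#!/bin/sh
# grep_field KEY FILE (hypothetical helper)
# Prints the value of every KEY=value token in FILE, one per line.
grep_field() {
  # "[^ ][^ ]*" is the portable way to write "one or more non-space characters".
  # cut keeps everything after the first "=", so values containing "=" survive.
  grep -o "$1=[^ ][^ ]*" "$2" | cut -d= -f2-
}

# Illustrative sample: one line cut from the data shown in the question.
sample=$(mktemp)
printf '%s\n' 'BeginJobID=S0065546 JESMSGLG(1/281) jname=CICWCMWD queue=EXECUTION' > "$sample"

grep_field jname "$sample"
grep_field BeginJobID "$sample"
rm -f "$sample"
```

Unlike an exact key comparison, the pattern here matches anywhere in a token, so a key such as name would also match inside jname=...; pass the full key to avoid surprises.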

Thor

Posted 2012-10-01T18:52:07.750

Reputation: 5 178