So here's my problem. I have a .csv file (Current.csv) with commas randomly placed throughout the data, so awk-ing the file doesn't give me consistent column numbers for the particular piece of information I'm looking for. Luckily, I realized this info is always the third instance of a date in the format (m,mm)/(d,dd)/yy. So I'm trying the regular expression below to display just the dates within the ith line:

awk -F',' '{if (NR==$i)print}' Current.csv | grep -o "[0-9]{1,2}/[0-9]{1-2}/[0-9]{1,2}" | echo

It doesn't display anything so far and I'm totally stuck as to why. My guess for displaying the third is to just pipe this all into:

awk {print $3}

Any ideas on the awk'd regular expression search problem?

Sample line
"lettershere",numbershere,"retardedbrokenquoteshere,mm/dd/yy,morestuff,mm/dd/yy,numbers,mm/dd/yy

2 Answers

Assuming that the CSV file is valid (i.e. fields containing commas are quoted), you should rather use something which actually parses it as CSV. The following simple Python script will extract the second column of each row.

python -c 'import csv; import sys; [sys.stdout.write(row[1]+"\n") for row in csv.reader(sys.stdin)]'
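For readability, the one-liner above can be spelled out as a small script. This is only a sketch assuming Python 3; the helper name `second_column` and the sample rows are made up for illustration, and rows with fewer than two fields are skipped rather than raising an error.

```python
import csv
import io


def second_column(stream):
    """Yield the second field of each CSV row (hypothetical helper)."""
    for row in csv.reader(stream):
        # Skip blank or short rows instead of failing with IndexError.
        if len(row) > 1:
            yield row[1]


# Made-up sample data: quoted fields may contain commas.
sample = io.StringIO('"a,b",12,"x"\n"c",34,"y,z"\n')
for value in second_column(sample):
    print(value)
```

To run it against the real file, pass an open file object (for example `open("Current.csv", newline="")`) instead of the `io.StringIO` sample.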
mgorven
  • HEYYYYY This is perfect! One slight change is that the second column turns out to be [1], but not like I would have guessed to incorporate Python :P this also clears up other issues I've been having. THANK YOU! :D –  May 17 '12 at 03:55
  • Whoops, yeah. Updated. – mgorven May 17 '12 at 04:08

The awk variable i is never set, so it defaults to zero and $i means $0 (the whole line); in any case you mean i and not $i.

You need grep -E for extended regular expressions; without it, {1,2} is not treated as a repetition count.

The {1-2} in the day field should be {1,2}. The forward slashes need no backslashes in grep; they would only need escaping inside an awk /.../ regular expression.

Piping to echo loses all output, because echo does not read standard input (it is not a filter); it is not needed in any case.

As mgorven suggests, use a different approach that handles csv.
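If the quoting turns out to be too broken for a CSV parser, the asker's original idea (take the third m/d/yy match on a line) can be sketched in Python with the corrected pattern. This is an illustration under assumptions: the function name `nth_date`, the two-digit year, and the sample line are hypothetical and not taken from the real file.

```python
import re

# One or two digits for month and day, two digits for the year (assumed).
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2}\b")


def nth_date(line, n=3):
    """Return the n-th date found in line, or None if there are fewer."""
    matches = DATE.findall(line)
    return matches[n - 1] if len(matches) >= n else None


# Made-up line shaped like the sample in the question (unterminated quote).
line = '"letters",123,"broken,5/17/12,more,12/1/11,42,3/4/12'
print(nth_date(line))
```

Because the pattern ignores commas and quotes entirely, stray delimiters in the line do not shift the result, which is the property the question was after.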

ramruma