Bash - Convert all instances of date format dd-mm-yyyy to yyyy-mm-dd

-3

Yes, before someone jumps up and attack me with a pitchfork, this is a duplicate question, but the others don't work for me, so I'm asking it myself now.

I have a CSV file that have dates somewhere in each entry. Added difficulty to the conversion is that sometimes the dates will have single digits for days. Example entries:

abc,0,2,-2,3-16-1994
xyz,1,2,3,10-09-1994

I want something, preferably sed, to convert those data entries to look like this:

abc,0,2,-2,1994-03-16
xyz,1,2,3,1994-09-10

I've tried:

sed 's|(..)-(..)-(....)|\3-\2-\1|'

But that gives an error and it doesn't really cover the single digit day issues.

I also tried:

awk -F - '{print $3$2$1}'

This actually has somewhat of a desired effect, but then again not really. The awk command converts it, but only the month and year, and instead of putting the date back where it was, it put's the month and year of the beginning of the line, leaving the day portion wherever it originally was.

Any help would be awesome!

Thanks in advance.

Edit

It was rightly pointed out in the comments that I made a mistake with my examples. The dates should be:

abc,0,2,-2,16-03-1994
xyz,1,2,3,2-05-1994

With the desired result being:

abc,0,2,-2,1994-03-16
xyz,1,2,3,1994-05-02

Sorry guys.

user1840352

Posted 2017-04-26T21:13:18.233

Reputation: 1

Question was closed 2017-04-27T09:02:43.407

Can you explain what about the other solutions didn't work? Are they just not the final format you want, or are they not appropriate for the input you're dealing with, or something else? – music2myear – 2017-04-26T21:40:51.800

@music2myear I explained what went wrong with the awk solution. The sed command actually threw an error, complaining about something being missing in "RE", an error I've never seen before. – user1840352 – 2017-04-26T21:44:33.647

Have you looked further into that error? – music2myear – 2017-04-26T21:53:51.490

As John C points out in his answer, you say your data are ‘‘dd-mm-yyyy’’, but your example data include ‘‘3-16-1994’’, which means the 3rd day of the 16th month — so your question is flawed.

– Scott – 2017-04-27T01:46:37.497

@Scott I fixed my question with an edit at the bottom. Sorry for the misleading mistake. – user1840352 – 2017-04-27T13:46:22.147

Please don't fix errors by adding an addendum that says "the above is wrong; it should be X "; fix them *by fixing them* (in place). – Scott – 2017-04-27T19:06:02.030

Answers

1

sed -r 's/(\d{1,2})-([0-9]{2})-([0-9]{4})/\3-\2-\1/g' seems to do the trick, there might be some other "smarter" ways ... I don't know

my code:

x="abc,0,2,-2,3-16-1994#xyz,1,2,3,10-09-1994"
echo $x |tr '#' '\n'| sed -r 's/([0-9]{1,2})-([0-9]{2})-([0-9]{4})/\3-\2-\1/g'

produces:

abc,0,2,-2,1994-16-3
xyz,1,2,3,1994-09-10

2 "warnings" for your data ... is a little inconsistent (maybe it was just for the example's sake) ...

  • on the first line, the first element of the date is not "zero-padded"
  • after looking at the second line, it's not clear if the input data is dd-mm-yyyy or mm-dd-yyyy

Later edit: I've first missed the part that padding the day was a requirement, taken this in consideration seems that the following helps

echo $x |tr '#' '\n'| sed -r 's/\b([0-9])-([0-9]{2})-([0-9]{4})\b/\3-\2-0\1/g; s/\b([0-9]{2})-([0-9]{2})-([0-9]{4})\b/\3-\2-\1/g'

practically we have 2 regex one (the first) matches the "dates" where day has only 1 digit and adds the padding "0" when doing the substitution, the other one matches the "dates" that have day with 2 digits and does only the reordering of the elements

got the idea from here https://stackoverflow.com/questions/12129382/add-leading-0-in-sed-substitution, so kudos to the person who answered there

Lohmar ASHAR

Posted 2017-04-26T21:13:18.233

Reputation: 111

On the one hand, yeah, the question is flawed.  On the other hand, the question says, “sometimes the dates will have single digits for days,” and shows the example of 3 in the input being converted to 03 in the output. – Scott – 2017-04-27T01:50:49.527

it was (very) late in my timezone and I've missed that part of the "requirement", updated the answer to cover it – Lohmar ASHAR – 2017-04-27T08:50:44.633

Seems to work. If you add s/\b([0-9]{1,2}-)([1-9]-[0-9]{4})\b/\10\2/; to the beginning of the sed command string, it’ll handle single-digit months, too. – Scott – 2017-04-27T09:07:25.993

@LohmarASHAR Thanks, I tried your solution. First the "--r" give me an illegal argument issue. When removing that, I get this error "sed: 1: "s/\b([0-9])-([0-9]{2})- ...": \3 not defined in the RE" – user1840352 – 2017-04-27T13:54:05.443

there's only 1 dash ("-") before the "r" ... or maybe it's only a typo in your comment, anyway I am using bash WSL (ubuntu in windows ), sed --version says sed (GNU sed) 4.2.2, sed --help gives -r, --regexp-extended use extended regular expressions in the script.. On other servers (some debian and centos) I've found version 4.2.1, and it's still working so ... maybe it wasn't a typo in the comment – Lohmar ASHAR – 2017-04-27T16:26:03.747

0

This works when I test it, note that I've swapped day and month since the original format in your example is actually mm-dd-yyy. Not sure if the zero padding is important to you or not, haven't tried that:

sed -i -r 's|([[:digit:]]*)-([[:digit:]]*)-([[:digit:]]*)|\3-\1-\2|' test.csv

John C

Posted 2017-04-26T21:13:18.233

Reputation: 1 034