Print every line but conditionally only fragments for some lines

1

I have a text file that looks like this:

   rno-miR-344-5p
   miRPlus_11239/mmu-miR-383/rno-miR-383
   hsa-miR-301a/mmu-miR-301a/rno-miR-301a
   hsa-miR-199a-3p/hsa-miR-199b-3p/mmu-miR-199a-3p/mmu-miR-199b/rno-miR-199a-3p
   Empty
   Hy3
   rno-miR-1

   rno-miR-598-5p
   spike_control_h

   Empty

I would like to print every line, but for lines that contain several slash-separated names, e.g. hsa-miR-301a/mmu-miR-301a/rno-miR-301a, I would like to print only the rno-miR-etc part.

I've been trying to do this with awk, but I am out of my depth.

duff

Posted 2014-05-19T17:15:28.427

Reputation: 385

Answers

2

Assuming a slash only occurs on the lines you are targeting:

awk -F/ '{print $NF}' file

will print only the last slash-separated field. On lines with no slash, field 1 is also the last field.
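To see why this works, it can help to print the field count next to the last field. The following is a minimal sketch, not part of the original command: it feeds three lines copied from the question through awk, and the variable names `sample` and `fields` are chosen here purely for illustration.

```shell
# Three lines copied from the question's sample file.
# 'sample' and 'fields' are illustrative names, not anything awk requires.
sample='rno-miR-344-5p
hsa-miR-301a/mmu-miR-301a/rno-miR-301a
Empty'

# With -F/ the slash is the field separator.
# NF is the number of fields on the line; $NF is the value of the last one.
fields=$(printf '%s\n' "$sample" | awk -F/ '{print NF, $NF}')
printf '%s\n' "$fields"
```

Lines without a slash report a field count of 1, so `$NF` is the whole line; the slash-separated line reports 3, and `$NF` is the text after the final slash.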

If you are specifically targeting lines starting with hsa-miR, then:

awk -F/ '/^hsa-miR/ {print $NF; next} {print}' file
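The two commands differ on lines that contain slashes but do not start with hsa-miR. As a rough illustration, assuming the lines begin in column one with no leading whitespace, the sketch below runs both commands over three lines copied from the question. The variable names `sample`, `last_field`, and `hsa_only` are made up for this example.

```shell
# Three lines copied from the question; the first contains slashes
# but starts with miRPlus rather than hsa-miR.
sample='miRPlus_11239/mmu-miR-383/rno-miR-383
hsa-miR-301a/mmu-miR-301a/rno-miR-301a
Hy3'

# First command: always print the last slash-separated field.
last_field=$(printf '%s\n' "$sample" | awk -F/ '{print $NF}')

# Second command: trim only lines that start with hsa-miR,
# and print every other line unchanged.
hsa_only=$(printf '%s\n' "$sample" | awk -F/ '/^hsa-miR/ {print $NF; next} {print}')

printf 'last field on every line:\n%s\n\n' "$last_field"
printf 'trim hsa-miR lines only:\n%s\n' "$hsa_only"
```

The first command reduces the miRPlus line to rno-miR-383, while the second leaves it intact because the pattern `/^hsa-miR/` does not match it.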

glenn jackman

Posted 2014-05-19T17:15:28.427

Reputation: 18 546

Great, your first solution is perfect. Could you please explain it? – duff – 2014-05-19T17:26:04.507

I thought I did. The -F option defines the field separator. The NF variable is the number of fields. The $ operator refers to the value of the given field number. – glenn jackman – 2014-05-19T17:40:05.683

Indeed you did. After posting I thought 'How lazy I am!' and went and looked up what NF did. Thanks again. – duff – 2014-05-20T19:50:00.517