Bash - sort starting from a character other than the first

1

1

I want to sort my file by the first column, but the comparison has to start from the 5th character. How can I do that?

My file:

"TTTTCTTACA"            1       1
"TTTTCTTACC"                    1
"TTTTCTTACT"    1       1
"TTTTCTTAGC"    1
"TTTTCTTATT"                    2
"TTTTCTTCAA"    1               1       1
"TTTTCTTCAG"    1               2       1
"TTTTCTTCAT"            1       2       2
"TTTTCTTCCT"                            2
"TTTTCTTCGG"                    2       2
"TTTTCTTCTA"                            1
"TTTTCTTCTG"            1
"TTTTCTTCTT"    1                       2
"TTTTCTTGAA"            1
"TTTTCTTGCT"    1               1       1
"TTTTCTTTAA"    1
"TTTTCTTTAG"            1       1
"TTTTCTTTCT"    1
"TTTTCTTTGC"    1
"TTTTCTTTGG"            1       1
"TTTTCTTTGT"    1       1       2       1
"TTTTCTTTTA"    1

I tried:

sort -k1,1 file | uniq -s 6 -w 5 

Of course, it doesn't work. Maybe sort has some flags for this, but I couldn't find them. Do you have any ideas?

diego9403

Posted 2016-06-22T15:08:35.967

Reputation: 807

"I want to sort my file by first column" - Your data is already sorted by the first column. Please explain what you are really trying to do. – DavidPostill – 2016-06-25T13:08:08.367

Answers

3

tl;dr

sort -k1.5 file | uniq -s 6 -w 5


Explanation

My sort is GNU coreutils 8.22. The manpage for my sort shows:

KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and  C
       a  character  position  in  the  field;  both are origin 1, and the stop position defaults to the
       line's end.

So in your current command, sort -k1,1 file uses the whole first field as the sort key: the key starts at the beginning of field 1 and stops at the end of field 1.

What you want is (for the sort command anyway):

sort -k1.5 file | uniq -s 6 -w 5

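This starts the sort key at the fifth character of the first field, which is what you wanted. Note that the opening double quote counts as character 1. Because no stop position is given, the key runs to the end of the line; use -k1.5,1 if the key should stop at the end of the first field.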
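To see the character offset in action, here is a minimal sketch on made-up data (the three sample lines and the temporary file are assumptions for illustration, not the asker's file). It uses -k1.5,1 so that the key is limited to the first field, and LC_ALL=C so that the ordering is plain byte order regardless of locale.

```shell
# Hypothetical sample data; the temporary file stands in for the asker's "file".
demo=$(mktemp)
printf '%s\n' '"AAATT" 1' '"CCCGG" 2' '"GGGAA" 3' > "$demo"

# -k1.5,1: the key starts at character 5 of field 1 and stops at the end of field 1.
# Character 1 is the opening double quote, so character 5 is the fourth letter.
# LC_ALL=C forces byte-order comparison so the result does not depend on the locale.
sorted=$(LC_ALL=C sort -k1.5,1 "$demo")
printf '%s\n' "$sorted"
# Prints:
# "GGGAA" 3
# "CCCGG" 2
# "AAATT" 1

rm -f "$demo"
```

The lines come out in reverse of their input order because the keys compared are TT", GG" and AA" rather than the full strings.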

bgStack15

Posted 2016-06-22T15:08:35.967

Reputation: 1 644

0

$ sort -k2 file

"TTTTCTTCTA"                            1
"TTTTCTTCCT"                            2
"TTTTCTTACC"                    1
"TTTTCTTATT"                    2
"TTTTCTTCGG"                    2       2
"TTTTCTTCTG"            1
"TTTTCTTGAA"            1
"TTTTCTTACA"            1       1
"TTTTCTTTAG"            1       1
"TTTTCTTTGG"            1       1
"TTTTCTTCAT"            1       2       2
"TTTTCTTAGC"    1
"TTTTCTTTAA"    1
"TTTTCTTTCT"    1
"TTTTCTTTGC"    1
"TTTTCTTTTA"    1
"TTTTCTTCTT"    1                       2
"TTTTCTTCAA"    1               1       1
"TTTTCTTGCT"    1               1       1
"TTTTCTTCAG"    1               2       1
"TTTTCTTACT"    1       1
"TTTTCTTTGT"    1       1       2       1

$ sort -k2 file | uniq -f 1

"TTTTCTTCTA"                            1
"TTTTCTTCCT"                            2
"TTTTCTTACC"                    1
"TTTTCTTATT"                    2
"TTTTCTTCGG"                    2       2
"TTTTCTTCTG"            1
"TTTTCTTACA"            1       1
"TTTTCTTCAT"            1       2       2
"TTTTCTTAGC"    1
"TTTTCTTCTT"    1                       2
"TTTTCTTCAA"    1               1       1
"TTTTCTTCAG"    1               2       1
"TTTTCTTACT"    1       1
"TTTTCTTTGT"    1       1       2       1

Buffalo Rabor

Posted 2016-06-22T15:08:35.967

Reputation: 11

The literal first column is already sorted, with no duplicates in your sample data, so I sorted by the first numeric column instead. – Buffalo Rabor – 2016-06-22T18:24:34.567