1
Imagine that there are 3 text files.
1.txt:
a
b
c
2.txt:
f
c
d
3.txt:
b
c
f
How do I sort the lines by how often each "line content" occurs across all the files, most frequent first? (Ties should be sorted alphabetically.)
Result:
c
b
f
a
d
4
You can use sort and uniq to sort the lines by frequency.
sort *.txt | uniq -c | sort -k1,1nr -k2 | sed 's/^ *[0-9]* //'
The second sort uses the secondary key -k2 to sort lines of the same frequency alphabetically. The final sed just removes the frequencies.
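To see what each stage contributes, here is a minimal sketch that recreates the three files from the question in a scratch directory and runs the pipeline one step at a time. The variable names (tmpdir, counts, ranked, result) are only illustrative, and setting LC_ALL=C is an assumption made here so that the alphabetical tie-break does not depend on the local collation rules.

```shell
#!/bin/sh
# Assumption: force byte-order collation so tie-breaking is reproducible.
export LC_ALL=C

# Scratch directory, so the glob only matches the sample files.
tmpdir=$(mktemp -d)

# Recreate the three files from the question.
printf '%s\n' a b c > "$tmpdir/1.txt"
printf '%s\n' f c d > "$tmpdir/2.txt"
printf '%s\n' b c f > "$tmpdir/3.txt"

# Stage 1: bring identical lines together, then count each group.
counts=$(sort "$tmpdir"/*.txt | uniq -c)

# Stage 2: order by count (numeric, descending), then by content (ascending).
ranked=$(printf '%s\n' "$counts" | sort -k1,1nr -k2)

# Stage 3: strip the leading count column.
result=$(printf '%s\n' "$ranked" | sed 's/^ *[0-9]* //')

printf '%s\n' "$result"

rm -rf "$tmpdir"
```

With the sample files this prints c, b, f, a, d, one per line, which matches the result asked for in the question.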
1
You can sort in descending order of frequency using sort and uniq:
$ sort *.txt | uniq -c | sort -rn
3 c
2 f
2 b
1 d
1 a
If you want to remove the count:
$ sort *.txt | uniq -c | sort -rn | sed 's/[[:space:]]*[[:digit:]]*[[:space:]]//'
c
f
b
d
a
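Note that two calls to sort are required. The first is because uniq -c only merges adjacent duplicate lines, so it requires sorted input. The second is needed to sort the lines into descending numerical order by count (frequency). As the output above shows, lines with the same count are not ordered alphabetically here (f comes before b); add a secondary sort key if you need that.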
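A small sketch of why the first sort matters: uniq -c counts runs of adjacent identical lines, so the same three input lines give different counts with and without sorting. The variable names and the three-line input are made up for illustration; the extra sed only trims the leading padding that uniq -c adds.

```shell
#!/bin/sh
# Unsorted input: the two "b" lines are not adjacent, so each is counted alone.
unsorted=$(printf '%s\n' b a b | uniq -c | sed 's/^ *//')

# Sorted input: the "b" lines become adjacent and are counted together.
sorted=$(printf '%s\n' b a b | sort | uniq -c | sed 's/^ *//')

echo "without sort:"
printf '%s\n' "$unsorted"
echo "with sort:"
printf '%s\n' "$sorted"
```

Without the sort, b is reported twice with a count of 1 each; with it, b is reported once with a count of 2.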
Didn't test yet, but gonna accept and upvote for that alphabetical part included. Thanks. – Samuel Shifterovich – 2016-07-06T23:26:08.847
No worries, I've tested it before posting :-) – choroba – 2016-07-06T23:32:18.043