4

I have a command (cmd1) that greps through a log file to filter out a set of numbers. The numbers are in random order, so I use sort -gr to get a reverse sorted list of numbers. There may be duplicates within this sorted list. I need to find the count for each unique number in that list.

For example, if the output of cmd1 is

100 100 100 99 99 26 25 24 24

I need another command that I can pipe the above output to, so that I get:

100 3 99 2 26 1 25 1 24 2
splattne
letronje
related: http://stackoverflow.com/questions/1092405/counting-duplicates-in-a-sorted-sequence-using-command-line-tools – David Cary Jun 24 '12 at 17:24

4 Answers

15

If you can handle the output being in a slightly different format, you could do:

cmd1 | tr " " "\n" | uniq -c

You'd get back:

  3 100
  2 99
  1 26
  1 25
  2 24
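If the number-first layout from the question is needed, the two columns can be swapped with a small awk step. This is only a sketch: the cmd1 function below is a hypothetical stand-in that echoes the sample data from the question, and count_runs is a name made up for illustration.

```shell
# Hypothetical stand-in for the real cmd1: emits the question's sample output.
cmd1() { echo '100 100 100 99 99 26 25 24 24'; }

# Put one number per line, count adjacent duplicates,
# then print the number before its count.
count_runs() { tr ' ' '\n' | uniq -c | awk '{ print $2, $1 }'; }

cmd1 | count_runs
```

Because uniq only merges adjacent lines, this relies on the input already being sorted, as it is in the question. If cmd1 already prints one number per line, the tr step changes nothing.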
Evan Anderson
1

Also add in the -u switch. Thus you would have:

cmd1 | sort -gru

From the sort manpage:

-u, --unique
without -c, output only the first of an equal run
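As a quick check of what -u does, the sketch below runs it on the sample numbers from the question. The cmd1 function is a hypothetical stand-in for the real command, and -n is used in place of -g only because -g is a GNU extension; for plain integers the two should order the data the same way.

```shell
# Hypothetical stand-in for the real cmd1: the sample numbers, unsorted, one per line.
cmd1() { printf '%s\n' 24 100 99 100 26 25 99 24 100; }

# Numeric, reverse, unique: each distinct number is printed exactly once.
cmd1 | sort -nru
```

Each number appears once in descending order, but the information about how often it occurred is gone, so this removes duplicates rather than counting them.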
Kevin M
0

(I'm assuming your input is one number per line, as that's what sort would output.)

You could try awk:

<your_command> | awk '{numbers[$1]++} END {for (number in numbers) print number " " numbers[number]}'

This would give you an unsorted list (the order in which awk walks through an array is undefined, as far as I know), so you'd have to sort it to your liking again.
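One way to restore the descending order is to sort on the first field after awk has finished. This is a sketch under assumptions: cmd1 is a hypothetical stand-in that prints the question's sample data one number per line, and count_numbers is a name chosen for illustration.

```shell
# Hypothetical stand-in for the real cmd1: one number per line.
cmd1() { printf '%s\n' 100 100 100 99 99 26 25 24 24; }

# Tally each number in an awk array, then sort the "number count"
# pairs by the first field, numerically, in descending order.
count_numbers() {
  awk '{ numbers[$1]++ } END { for (number in numbers) print number, numbers[number] }' |
    sort -k1,1nr
}

cmd1 | count_numbers
```

Since awk counts every occurrence regardless of position, this variant does not need the input to be sorted beforehand; only the final sort determines the output order.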

Geoff Fritz
0
$ echo '100 100 100 99 99 26 25 24 24' | perl -e 'while (<>) { chomp; my %nums; foreach (split(/ /)) { $nums{$_} += 1; }; foreach (sort {$b <=> $a} keys %nums) { print "$_ $nums{$_} " }; print "\n"; }'
100 3 99 2 26 1 25 1 24 2
towo