Histogram generation

12

1

Write the shortest program that generates a histogram (a graphical representation of the distribution of data).

Rules:

  • Must generate a histogram based on the character length of the words (punctuation included) input into the program. (If a word is 4 letters long, the bar representing the number 4 increases by 1)
  • Must display bar labels that correlate with the character length the bars represent.
  • All characters must be accepted.
  • If the bars must be scaled, the histogram must show the scale in some way.

Examples:

$ ./histogram This is a hole in one!
1 |#
2 |##
3 |
4 |###

$ ./histogram Extensive word length should not be very problematic.
1 |
2 |#
3 |#
4 |##
5 |
6 |##
7 |
8 |
9 |#
10|
11|
12|#

$ ./histogram Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for.
1 |##
2 |#######
3 |#
4 |#######
5 |###
6 |#
7 |##
8 |##
9 |##
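
For reference, here is an ungolfed sketch of what the rules and examples above seem to ask for. It assumes words are separated by whitespace (the question does not say so explicitly) and pads labels to at least two columns to match the examples; the function name `histogram` is only illustrative.

```python
from collections import Counter


def histogram(text):
    """Return one histogram line per word length from 1 up to the longest word."""
    # Assumption: words are whitespace-separated; punctuation counts as characters.
    counts = Counter(len(word) for word in text.split())
    if not counts:
        return []
    longest = max(counts)
    # Pad labels to at least two columns, as in the examples above.
    width = max(2, len(str(longest)))
    return [f"{n:<{width}}|{'#' * counts[n]}" for n in range(1, longest + 1)]


if __name__ == "__main__":
    import sys
    print("\n".join(histogram(" ".join(sys.argv[1:]))))
```

Lengths with no words still get a label and an empty bar, which is what the examples show for length 3 in the first one.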

syb0rg

Posted 2013-12-11T00:21:48.707

Reputation: 1 080

Please write a specification rather than giving a single example which, solely by virtue of being a single example, cannot express the range of acceptable output styles, and which doesn't guarantee to cover all corner cases. It's good to have a few test cases, but it's even more important to have a good spec. – Peter Taylor – 2013-12-11T00:37:13.890

@PeterTaylor More examples given. – syb0rg – 2013-12-11T02:19:04.710

1. This is tagged [tag:graphical-output], which means that it's about drawing on the screen or creating an image file, but your examples are [tag:ascii-art]. Is either acceptable? (If not then plannapus might not be happy). 2. You define punctuation as forming countable characters in a word, but you don't state which characters separate words, which characters may and may not occur in the input, and how to handle characters which may occur but which are not alphabetic, punctuation, or word separators. 3. Is it acceptable, required, or prohibited to rescale the bars to fit in a sensible size? – Peter Taylor – 2013-12-11T08:45:39.780

    @PeterTaylor I didn't tag it ascii-art, because it really isn't "art". plannapus's solution is just fine. – syb0rg – 2013-12-11T13:19:50.843

    @PeterTaylor I've added in some rules based on what you described. So far, all the solutions here adhere to all of the rules still. – syb0rg – 2013-12-11T13:45:33.020

    Answers

    3

    K, 35

    {(1+!|/g)#`$(#:'=g:#:'" "\:x)#'"#"}
    

    .

    k){(1+!|/g)#`$(#:'=g:#:'" "\:x)#'"#"}"Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for."
    1| ##
    2| #######
    3| #
    4| #######
    5| ###
    6| #
    7| ##
    8| ##
    9| ##
    

    .

    A longer example

    k){(1+!|/g)#`$(#:'=g:#:'" "\:x)#'"#"}"Please write a specification rather than giving a single example which, solely by virtue of being a single example, cannot express the range of acceptable output styles, and which doesnt guarantee to cover all corner cases. Its good to have a few test cases, but its even more important to have a good spec."
    1 | #####
    2 | ######
    3 | #######
    4 | ########
    5 | ######
    6 | ##############
    7 | ###
    8 | #
    9 | ##
    10| #
    11|
    12|
    13| #
    

    tmartin

    Posted 2013-12-11T00:21:48.707

    Reputation: 3 917

    What happens if there are words with more than 9 letters? – None – 2013-12-11T23:49:18.707

    It works for words of any length – tmartin – 2013-12-12T10:14:30.730

    5

    Python - 83 characters

    Seems that we can take input from anywhere, so this takes input during execution, rather than from the command line, and uses Ejrb's suggestion to shorten it by 8.

    s=map(len,raw_input().split())
    c=0;exec'c+=1;print"%3d|"%c+"#"*s.count(c);'*max(s)
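
The exec trick above repeats a statement string max(s) times and runs it once, instead of writing a loop. A rough Python 3 sketch of the same idea follows; it collects the lines in a list rather than printing them, and the names `exec_histogram`, `env` and `out` are made up for illustration.

```python
def exec_histogram(text):
    """Build histogram lines by exec-ing a repeated statement string."""
    s = [len(w) for w in text.split()]
    # exec runs in this dict, so the counter c and the list out persist
    # across the repeated statements.
    env = {"s": s, "c": 0, "out": []}
    statement = 'c+=1;out.append("%3d|"%c+"#"*s.count(c));'
    exec(statement * max(s), env)  # one copy of the statement per bar
    return env["out"]
```

For example, exec_histogram("This is a hole in one!") yields four lines, one per length from 1 to 4.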
    

    Python - 91 characters

    This will fall over with quotes.

    import sys;s=map(len,sys.argv[1:])
    for i in range(1,max(s)+1):print"%3d|"%i+'#'*s.count(i)
    

    Input:

    > python hist.py Please write a specification rather than giving a single example which, solely by virtue of being a single example, cannot express the range of acceptable output styles, and which doesnt guarantee to cover all corner cases. Its good to have a few test cases, but its even more important to have a good spec.
    

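    Output (this matches the original text with its apostrophes intact, i.e. doesn't, It's and it's, rather than the stripped input shown above):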

      1|#####
      2|######
      3|#####
      4|##########
      5|######
      6|#############
      7|####
      8|#
      9|##
     10|#
     11|
     12|
     13|#
    

    user8777

    Posted 2013-12-11T00:21:48.707

    Reputation:

    nice, you can shave off 4 chars by reworking your second line (no algorithm change) to use exec and string concatenation: c=0;exec'c+=1;print"%3d|"%c+"#"*s.count(c);'*max(s) – ejrb – 2013-12-11T15:12:27.587

    5

    R, 55 47 characters

    hist(a<-sapply(scan(,""),nchar),br=.5+0:max(a))
    

    Luckily R comes with a plotting function for histograms, hist, supplied here with a breaks argument whose breaks are 0.5, 1.5, ... up to max(input)+0.5. sapply(scan(,""),nchar) reads input from stdin, splits it on whitespace, and counts the number of characters in each element.
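
To make the choice of breaks concrete, here is a small Python sketch (not R) of the same binning: with edges at 0.5, 1.5, ..., max+0.5, every integer length falls strictly inside exactly one bin, so each bar corresponds to one word length. The helper name `bin_counts` is hypothetical.

```python
def bin_counts(lengths):
    """Count integer lengths in bins whose edges sit at half-integers."""
    top = max(lengths)
    edges = [0.5 + k for k in range(top + 1)]  # 0.5, 1.5, ..., top + 0.5
    # Each bin is (lo, hi]; integers never land on an edge, so the choice
    # of which side is closed makes no difference here.
    counts = [sum(lo < n <= hi for n in lengths)
              for lo, hi in zip(edges, edges[1:])]
    return edges, counts


lengths = [len(w) for w in
           "Extensive word length should not be very problematic.".split()]
edges, counts = bin_counts(lengths)
```

Here counts[i] is the number of words of length i+1.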

    Examples:

    hist(a<-sapply(scan(,""),nchar),br=.5+0:max(a))
    1: Extensive word length should not be very problematic.
    9: 
    Read 8 items
    

    [Image: histogram plot produced by the command above]

    hist(a<-sapply(scan(,""),nchar),br=.5+0:max(a))
    1: Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for.
    28: 
    Read 27 items
    

    [Image: histogram plot produced by the command above]

    Edit:

    A variation at 71 characters with an axis label at each possible value:
    hist(a<-sapply(scan(,""),nchar),br=.5+0:max(a),ax=F);axis(1,at=1:max(a))
    

    [Image: histogram plot with an axis label at each value]

    plannapus

    Posted 2013-12-11T00:21:48.707

    Reputation: 8 610

    I love when a normally verbose language takes the lead! – None – 2013-12-11T10:06:24.263

    This doesn't comply with the specification, however... – Doorknob – 2013-12-11T13:02:29.397

    @Doorknob which specification doesn't it comply with? – plannapus – 2013-12-11T13:06:19.510

    The example testcases. – Doorknob – 2013-12-11T13:07:34.333

    They are examples, not specifications... – plannapus – 2013-12-11T13:08:36.333

    4

    Haskell - 126 characters

    p[d]=[' ',d];p n=n
    h l=[1..maximum l]>>= \i->p(show i)++'|':(l>>=($"#").drop.abs.(i-))++"\n"
    main=interact$h.map length.words
    

    This takes the input from stdin, not the command line:

    & head -500 /usr/share/dict/words | runhaskell 15791-Histogram.hs 
     1|##
     2|##
     3|######
     4|###############
     5|################################################
     6|###############################################################
     7|###################################################################
     8|###########################################################################
     9|#############################################################
    10|##########################################################
    11|#########################################################
    12|#########################
    13|#######
    14|###
    15|#####
    16|###
    17|#
    18|
    19|#
    20|#
    

    MtnViewMark

    Posted 2013-12-11T00:21:48.707

    Reputation: 4 779

    Looks good to me! +1 – syb0rg – 2013-12-11T13:16:30.357

    3

    Perl, 56

    $d[y///c].='#'for@ARGV;printf"%2d|$d[$_]
    ",$_ for+1..$#d
    

    Added @manatwork's rewrite and literal newline suggestion, thank you very much! Added @chinese_perl_goth's updates.

    Usage: save as hist.pl and run perl hist.pl This is a test

    Example output:

    $perl ~/hist.pl This is a test of the histogram function and how it will count the number of words of specific lengths. This sentence contains a long word 'complexity'.
     1|##
     2|#####
     3|####
     4|######
     5|##
     6|#
     7|
     8|#####
     9|#
    10|
    11|#
    

    Dom Hastings

    Posted 2013-12-11T00:21:48.707

    Reputation: 16 415

    1

    Why not use printf? You could spare some characters on formatting. And some more by changing from hash to array: $d[y///c]++for@ARGV;shift@d;printf"%2d|%s\n",++$i,"#"x$_ for@d.

    – manatwork – 2013-12-11T13:29:36.697

    Can I see an example of this program at work? – syb0rg – 2013-12-11T13:48:43.097

    @manatwork printf didn't occur to me at all and for some reason I didn't think I could get the effect I wanted with an array, amazing! @syb0rg adding now – Dom Hastings – 2013-12-11T14:01:22.020

    golfed some more, got it down to 57 bytes: $d[y///c].='#'for@ARGV;printf"%2d|$d[$_]\n",$_ for+1..$#d – chinese perl goth – 2013-12-12T21:20:48.503

    @chineseperlgoth thank you! Updated the post! Not sure why building the string directly didn't occur to me! – Dom Hastings – 2013-12-13T07:05:35.770

    1

    We missed just the simplest trick: use a literal newline instead of \n to spare 1 more character. I mean like this: http://pastebin.com/496z2a0n

    – manatwork – 2013-12-13T09:42:08.277

    3

    Python 3.3 (93)

    a=[len(i) for i in input().split()]
    for i in range(1,max(a)+1):
     print(i,'|',"#"*a.count(i))
    

    Output:
    (the first line is the input string)

    Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for.
    1 | ##
    2 | #######
    3 | #
    4 | #######
    5 | ###
    6 | #
    7 | ##
    8 | ##
    9 | ##
    

    It doesn't justify the numbers the way Lego Stormtroopr's Python solution does (and that one is also shorter than mine), but it's my first entry ever in a golfing contest, so I might as well leave it here I guess :)

    Roberto

    Posted 2013-12-11T00:21:48.707

    Reputation: 761

    Could you edit in an example of a generated histogram by this program? – syb0rg – 2013-12-12T00:05:28.833

    Yes, but I just noticed it has one problem: it doesn't justify the numbers as Lego Stormtroopr's solution, so I'm actually thinking about retiring the answer. – Roberto – 2013-12-12T00:07:10.243

    As long as there are labels for the represented bars, the answer is acceptable. – syb0rg – 2013-12-12T00:08:03.160

    Ok, done then! :) – Roberto – 2013-12-12T00:12:46.250

    This takes input from input, not from arguments. Is this valid @syb0rg ? – None – 2013-12-12T22:36:49.660

    @LegoStormtroopr Yes, as long as the histogram is accurate, any input is acceptable. – syb0rg – 2013-12-12T22:38:18.713

    3

    J, 48 47 46 45 43 characters

    (;#&'#')/"1|:((],[:+/=/)1+[:i.>./)$;._1' ',
    

    Usage:

       (;#&'#')/"1|:((],[:+/=/)1+[:i.>./)$;._1' ','Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for.'
    ┌─┬───────┐
    │1│##     │
    ├─┼───────┤
    │2│#######│  
    ├─┼───────┤
    │3│#      │
    ├─┼───────┤
    │4│#######│
    ├─┼───────┤
    │5│###    │
    ├─┼───────┤
    │6│#      │
    ├─┼───────┤
    │7│##     │
    ├─┼───────┤
    │8│##     │
    ├─┼───────┤
    │9│##     │
    └─┴───────┘
    

    Gareth

    Posted 2013-12-11T00:21:48.707

    Reputation: 11 678

    Tacit, 38 [:((](;#&'#')"0[:+/=/)1+[:i.>./)#@>@;: : Try it online!

    – Jonah – 2019-12-22T07:45:11.653

    2

    Ruby, 98 85

    a=$*.group_by &:size
    1.upto(a.max[0]){|i|b=a.assoc i
    puts"%-2i|#{b&&?#*b[1].size}"%i}
    

    Not golfed much. Will golf more later.

    c:\a\ruby>hist This is a test for the histogram thingy. yaaaaaaaaaaaay
    1 |#
    2 |#
    3 |##
    4 |##
    5 |
    6 |
    7 |#
    8 |
    9 |#
    10|
    11|
    12|
    13|
    14|#
    

    Doorknob

    Posted 2013-12-11T00:21:48.707

    Reputation: 68 138

    Works nicely (++voteCount). Anything I could do to word the question better? – syb0rg – 2013-12-11T02:59:29.663

    @syb0rg IMO the question is worded fine, the examples speak for themselves. Although your last one seems to have an error... I count 2 8-letter words (generate and generate) and 2 9-letter words (histogram, histogram) – Doorknob – 2013-12-11T03:00:11.660

    Cool. You could change b ?(?#*b[1].size):'' with b&&?#*b[1].size. – manatwork – 2013-12-11T08:46:10.953

    2

    Powershell, 97 93

    $a=@{};$args-split ' '|%{$a[$_.length]++};1..($a.Keys|sort)[-1]|%{"{0,-2} |"-f $_+"#"*$a[$_]}
    

    Example:

    PS Z:\> .\hist.ps1 This is an example of this program running
    1  |
    2  |###
    3  |
    4  |##
    5  |
    6  |
    7  |###
    

    Danko Durbić

    Posted 2013-12-11T00:21:48.707

    Reputation: 10 241

    Can I see an example of this program running? – syb0rg – 2013-12-11T13:48:22.597

    @syb0rg Sure, I've updated the answer with an example. – Danko Durbić – 2013-12-11T13:59:49.543

    Looks good! +1 to you! – syb0rg – 2013-12-11T14:00:26.667

    Nice. You could remove extra spaces and save 6 bytes $a=@{};-split$args|%{$a[$_.length]++};1..($a.Keys|sort)[-1]|%{"{0,-2}|"-f$_+"#"*$a[$_]} – mazzy – 2018-09-27T08:02:29.773

    2

    APL (42)

    ⎕ML←3⋄K,⊃⍴∘'▓'¨+⌿M∘.=K←⍳⌈/M←↑∘⍴¨I⊂⍨' '≠I←⍞
    

    Could be shorter if I could omit lines where the value is 0.

    Explanation:

    • ⎕ML←3: set the migration level to 3 (this makes ⊂ (partition) more useful).
    • I⊂⍨' '≠I←⍞: read input, split on spaces
    • M←↑∘⍴¨: get the size of the first dimension of each item (word lengths), and store in M
    • K←⍳⌈/M: get the numbers from 1 to the highest value in M, store in K
    • +⌿M∘.=K: for each value in K, count how many times it occurs in M.
    • ⊃⍴∘'▓'¨: for each value in that, get a list of that many ▓s, and format it as a matrix.
    • K,: prepend each value in K to each row in the matrix, giving the labels.
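
The steps above can be sketched in Python roughly as follows. This is only an illustration of the outer-product-and-column-sum idea, not a translation of the APL; the function name `apl_style_histogram` is made up, and the output matching is only checked for single-digit labels.

```python
def apl_style_histogram(line, bar="▓"):
    """Mimic the APL steps: equality table of lengths vs. 1..max, summed by column."""
    words = [w for w in line.split(" ") if w]       # split on spaces
    m = [len(w) for w in words]                     # M: word lengths
    k = list(range(1, max(m) + 1))                  # K: 1 .. max(M)
    table = [[int(x == y) for y in k] for x in m]   # M∘.=K: one row per word
    counts = [sum(col) for col in zip(*table)]      # +⌿: sum down each column
    widest = max(counts)
    # Pad every bar to the widest one, as happens when the bars are
    # mixed into a character matrix.
    return [f"{label} {(bar * c).ljust(widest)}" for label, c in zip(k, counts)]
```

The padding is why the shorter rows in the output below end in trailing spaces.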

    Output:

          ⎕ML←3⋄K,⊃⍴∘'▓'¨+⌿M∘.=K←⍳⌈/M←↑∘⍴¨I⊂⍨' '≠I←⍞
    This is a hole in one!
    1 ▓  
    2 ▓▓ 
    3    
    4 ▓▓▓
          ⎕ML←3⋄K,⊃⍴∘'▓'¨+⌿M∘.=K←⍳⌈/M←↑∘⍴¨I⊂⍨' '≠I←⍞
    Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for.
    1 ▓▓     
    2 ▓▓▓▓▓▓▓
    3 ▓      
    4 ▓▓▓▓▓▓▓
    5 ▓▓▓    
    6 ▓      
    7 ▓▓     
    8 ▓▓     
    9 ▓▓     
    

    marinus

    Posted 2013-12-11T00:21:48.707

    Reputation: 30 224

    2

    Mathematica 97

    Histogram["" <> # & /@ StringCases[StringSplit[InputString[]], WordCharacter] /. 
    a_String :> StringLength@a]
    

    When I input text of the Declaration of Independence as a single string (through cut and paste, of course), the output generated was:

    [Image: histogram of word lengths in the Declaration of Independence]

    DavidC

    Posted 2013-12-11T00:21:48.707

    Reputation: 24 524

    2

    Ruby, 79

    (1..(w=$*.group_by &:size).max[0]).map{|i|puts"%2i|#{?#*w.fetch(i,[]).size}"%i}
    

    Example run:

    $ ruby hist.rb Histograms, histograms, every where, nor any drop to drink.
     1|
     2|#
     3|##
     4|#
     5|#
     6|##
     7|
     8|
     9|
    10|
    11|##
    

    Please see my Forth submission for a laugh.

    Darren Stone

    Posted 2013-12-11T00:21:48.707

    Reputation: 5 072

    2

    Forth, 201

    This was fun but my Ruby submission is more competitive. ;-)

    variable w 99 cells allot w 99 cells erase : h begin
    1 w next-arg ?dup while swap drop dup w @ > if dup w
    ! then cells + +! repeat w @ 1+ 1 ?do i . 124 emit i
    cells w + @ 0 ?do 35 emit loop cr loop ; h
    

    Sample run:

    $ gforth histo.fth Forth words make for tough golfing!
    1 |
    2 |
    3 |#
    4 |#
    5 |###
    6 |
    7 |
    8 |#
    

    Max word length is 99.

    Darren Stone

    Posted 2013-12-11T00:21:48.707

    Reputation: 5 072

    2

    Ruby 1.8.7, 74

    A slightly different take from the other Ruby solutions:

    i=0;$*.map{|v|v.size}.sort.map{|v|$><<(i+1..v).map{|n|"
    %2i:"%i=n}+['#']}
    

    output:

    ruby hist.rb `head -400 /usr/share/dict/words`
    
     1:#
     2:###
     3:######
     4:#############################
     5:#####################################################
     6:############################################################
     7:########################################################################
     8:######################################################
     9:############################################################
    10:########################
    11:###########################
    12:######
    13:#####
    

    AShelly

    Posted 2013-12-11T00:21:48.707

    Reputation: 4 281

    I didn't see this submission initially, sorry! +1 – syb0rg – 2014-01-09T21:37:35.913

    1

    JavaScript (159 133)

    Definitely not competitive, but so far the only JavaScript solution. Thanks to @manatwork for the tip on using String.replace.

    prompt(o=[]).replace(/\S+/g,function(p){o[l=p.length]=(o[l]||'')+'#'});for(i=1;i<o.length;)console.log(i+(i>9?"|":" |")+(o[i++]||""))
    

    Input

    Code Golf is a question and answer site for programming puzzle enthusiasts and code golfers. It's built and run by you as part of the Stack Exchange network of Q&A sites. With your help, we're working together to build a library of programming puzzles and their solutions.

    Output

    1 |##
    2 |#######
    3 |#########
    4 |########
    5 |######
    6 |###
    7 |####
    8 |####
    9 |
    10|#
    11|###
    

    quietmint

    Posted 2013-12-11T00:21:48.707

    Reputation: 204

    Indeed, this is not really a field where JavaScript excels. But with replace() instead of split()+for, and an Array instead of an Object plus a separate length variable, it can be reduced by a few characters: prompt(o=[]).replace(/\S+/g,function(p){o[l=p.length]=(o[l]||"")+"#"});for(i=1;i<o.length;)console.log(i+(i>9?"|":" |")+(o[i++]||"")). (And even shorter in Harmony: prompt(o=[]).replace(/\S+/g,p=>o[l=p.length]=(o[l]||"")+"#");for(i=1;i<o.length;)console.log(i+(i>9?"|":" |")+(o[i++]||"")).) – manatwork – 2013-12-15T17:03:58.367

    @manatwork Nice abuse of .length there. – quietmint – 2013-12-15T17:24:30.940

    1

    8th, 162 bytes

    Code

    a:new ( args s:len nip tuck a:@ ( 0 ) execnull rot swap n:1+ a:! ) 0 argc n:1- loop 
    a:len n:1- ( dup . "|" . a:@ ( 0 ) execnull "#" swap s:* . cr ) 1 rot loop bye
    

    Usage

    $ 8th histogram.8th Nel mezzo del cammin di nostra vita mi ritrovai per una selva oscura
    

    Output

    1|
    2|##
    3|####
    4|#
    5|##
    6|###
    7|
    8|#
    

    Ungolfed code (SED is Stack Effect Diagram)

    a:new               \ create an empty array 
    ( 
        args s:len      \ length of each argument
                        \ SED: array argument lengthOfArgument
        nip             \ SED: array lengthOfArgument
        tuck            \ SED: lengthOfArgument array lengthOfArgument
        a:@             \ get the array item at position "lengthOfArgument"
        ( 0 ) execnull  \ if null put 0 on TOS
                        \ SED: lengthOfArgument array itemOfArray
        rot             \ SED: array itemOfArray lengthOfArgument    
        swap            \ SED: array lengthOfArgument itemOfArray
        n:1+            \ increment counter for the matching length
        a:!             \ store updated counter into array 
    ) 0 argc n:1- loop  \ loop through each argument
    \ print histogram
    a:len n:1- ( dup . "|" . a:@ ( 0 ) execnull "#" swap s:* . cr ) 1 rot loop 
    bye                 \ exit program
    

    Chaos Manor

    Posted 2013-12-11T00:21:48.707

    Reputation: 521

    1

    Pure bash 120

    d="$@"
    d=${d//[ -z]/#}
    for a;do((b[${#a}]++));done
    e="${!b[@]}"
    for((i=1;i<=${e##* };i++));do
    echo $i\|${d:0:b[i]}
    done
    

    Sample:

    ./histogram.sh Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for.
    1|##
    2|#######
    3|#
    4|#######
    5|###
    6|#
    7|##
    8|##
    9|##
    

    Save 8 chars by using one fork to tr: 112

    for a;do((b[${#a}]++));done
    e="${!b[@]}"
    for((i=1;i<=${e##* };i++));do
    printf "%d|%${b[i]}s\n" $i
    done|tr \  \#
    

    This gives the same result:

    bash -c 'for a;do((b[${#a}]++));done;e="${!b[@]}";for((i=1;i<=${e##* };i++));
    do printf "%d|%${b[i]}s\n" $i;done|tr \  \#' -- $( sed 's/<[^>]*>//g;
    s/<[^>]*$//;s/^[^<]*>//' < /usr/share/scribus/loremipsum/english.xml )
    

    Output (on my host):

    1|############################################################
    2|#################################################################################################################################################################################################################
    3|####################################################################################################################################################################################################################################################
    4|####################################################################################################################################################################################################
    5|####################################################################################################################################################################
    6|#######################################################################################
    7|##########################################################################################
    8|###################################################
    9|###############################
    10|####################
    11|#########
    12|############
    13|#####
    14|####
    15|##
    16|
    17|
    18|
    19|
    20|
    21|
    22|
    23|
    24|
    25|
    26|
    27|
    28|
    29|
    30|
    31|
    32|
    33|
    34|#
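
Bars this long are the situation the scaling rule in the question has in mind. As a hedged sketch (in Python, not bash), one way to scale and still show the scale in the histogram is to let each # stand for several words and append a legend line. The function name, the 60-column limit and the legend wording are all assumptions, not part of this answer.

```python
from collections import Counter


def scaled_histogram(words, max_width=60):
    """Return histogram lines, scaling bars so none exceeds max_width marks."""
    counts = Counter(len(w) for w in words)
    if not counts:
        return []
    peak = max(counts.values())
    per_mark = -(-peak // max_width)  # ceiling division: words per '#'
    lines = []
    for n in range(1, max(counts) + 1):
        # Round up so that any non-empty bin still shows at least one mark.
        marks = -(-counts[n] // per_mark)
        lines.append(f"{n:>2}|{'#' * marks}")
    if per_mark > 1:
        # The legend is how the scale is made visible in the histogram.
        lines.append(f"(each # = up to {per_mark} words)")
    return lines
```

When no scaling is needed, per_mark is 1 and the legend line is omitted, so short inputs look the same as the unscaled versions.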
    

    F. Hauri

    Posted 2013-12-11T00:21:48.707

    Reputation: 2 654

    1

    PHP, 162

    <?php error_reporting(0);$b=0;while($argv[$b])$c[strlen($argv[++$b])]++;for($t=1;$t<=max(array_keys($c));$t++)echo $t.'|'.($c[$t]?str_repeat('#',$c[$t]):'')."\n";
    

    Usage:

    php histogram.php Very long strings of words should be just as easy to generate a histogram just as short strings of words are easy to generate a histogram for.
    1|##
    2|#######
    3|#
    4|#######
    5|###
    6|#
    7|##
    8|##
    9|##
    

    Piotr Kepka

    Posted 2013-12-11T00:21:48.707

    Reputation: 11