Alphabet Histogram

33

Given an input sentence consisting of one or more words [a-z]+ and zero or more spaces, output an ASCII-art histogram (bar graph) of the letter distribution of the input sentence.

The histogram must be laid out horizontally, i.e. with the letter key along the bottom in alphabetical order from left to right, with a Y-axis labeled 1- and every 5 units. The Y-axis must be the smallest multiple of five that is at least as tall as the tallest bar, and must be right-aligned. The X-axis is labeled with the input letters, with no gaps between. For example, input a bb dd should have label abd and not ab d, skipping the c. The bars themselves can be made of any consistent ASCII character -- I'll be using X here in my examples.

test example

5-

   X
   X    X
1-XXXXXXXX
  aelmpstx

Since there are three e, two t, and one each of almpsx.

More examples:

the quick brown fox jumped over the lazy dogs

5-
      X         X
      X         X
     XX  X      X  X XX
1-XXXXXXXXXXXXXXXXXXXXXXXXXX
  abcdefghijklmnopqrstuvwxyz


now is the time for all good men to come to the aid of their country

10-
              X
              X
              X  X
      X       X  X
 5-   X       X  X
      X   X   X  X
      X  XX XXXX X
   XXXXX XXXXXXX X
 1-XXXXXXXXXXXXXXXXXX
   acdefghilmnorstuwy

a bb ccc dddddddddddd

15-


      X
      X
10-   X
      X
      X
      X
      X
 5-   X
      X
     XX
    XXX
 1-XXXX
   abcd

a bb ccccc

5-  X
    X
    X
   XX
1-XXX
  abc
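
For reference, the layout rules above can be sketched as an ungolfed Python function. This is an illustrative reading of the spec, not a submitted answer: the name `histogram` and the `bar` parameter are made up for the sketch, and trailing spaces are stripped on the assumption that the rules treat them as optional.

```python
from collections import Counter


def histogram(sentence, bar="X"):
    """Return the bar graph for `sentence` as a list of lines."""
    counts = Counter(sentence.replace(" ", ""))
    letters = sorted(counts)
    # Smallest multiple of five that is at least as tall as the tallest bar.
    height = -(-max(counts.values()) // 5) * 5
    width = len(str(height)) + 1  # room for the widest label plus its dash

    lines = []
    for y in range(height, 0, -1):
        # Only rows 1, 5, 10, ... carry a right-aligned label.
        label = f"{y}-" if y == 1 or y % 5 == 0 else ""
        bars = "".join(bar if counts[c] >= y else " " for c in letters)
        lines.append((label.rjust(width) + bars).rstrip())
    # X-axis: the letters that occur, adjacent and in alphabetical order.
    lines.append(" " * width + "".join(letters))
    return lines


print("\n".join(histogram("test example")))
```

Returning a list of lines keeps the sketch usable both for printing and for comparing against the examples.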

I/O and Rules

  • Input can be taken in any reasonable format and by any convenient method. This also means you can take input in all-uppercase, if that makes more sense for your code.
  • Leading/trailing newlines or other whitespace are optional, provided that the characters line up appropriately.
  • Either a full program or a function are acceptable. If a function, you can return the output rather than printing it.
  • Output can be to the console, returned as a list of strings, returned as a single string, etc.
  • Standard loopholes are forbidden.
  • This is code-golf, so all usual golfing rules apply, and the shortest code (in bytes) wins.

AdmBorkBork

Posted 8 years ago

Reputation: 41 581

I think this would be a bar graph rather than a histogram, as it's categorical rather than numeric data, but I'm mostly being pedantic. – Giuseppe – 8 years ago

is the input guaranteed to be non-empty? – dzaima – 8 years ago

@dzaima Yes, the input is guaranteed non-empty. There will be at least one word. – AdmBorkBork – 8 years ago

To confirm, we can't go up to, say, 15 lines in the first example and include y axis markers for 10 and 15, right? – dylnan – 8 years ago

@dylnan Right. The Y-axis should only be large enough to contain the data. – AdmBorkBork – 8 years ago

@dylnan No, not necessarily. The input could consist of only one word without any spaces. – AdmBorkBork – 8 years ago

Just being a pedant, but this isn't a histogram, it's a bar chart. Still a nice challenge though! – caird coinheringaahing – 8 years ago

"Leading/trailing newlines or other whitespace are optional, provided that the characters line up appropriately." -- so columns of whitespace are acceptable? – Jonathan Allan – 8 years ago

@JonathanAllan If you're asking if you could do something like abc e for the columns, no that's not okay. If you have a whole leading column of whitespace (e.g., a leading space on every line), that's fine. – AdmBorkBork – 8 years ago

I was asking for a d e h style rather than a de h – Jonathan Allan – 8 years ago

@JonathanAllan The letters must be adjacent. I'll make that explicitly clear. – AdmBorkBork – 8 years ago

Maybe add test cases that test extension of the Y-axis at the right points, i.e. with 10 and with 11? – Adám – 8 years ago

Can the y-axis labels be left-aligned? – Magic Octopus Urn – 8 years ago

@MagicOctopusUrn Nope, right-aligned. That's just the rules. ;-) – AdmBorkBork – 8 years ago

Does the X axis need to be sorted? – Erik the Outgolfer – 8 years ago

A Tuftian approach would be to make the bars out of the characters represented and not have a separate label row. – dmckee --- ex-moderator kitten – 8 years ago

@dmckee For some cases that could be golfier too! :P – Erik the Outgolfer – 8 years ago

Related, but only in the output style. – FryAmTheEggman – 8 years ago

The histogram character has to be consistent, but across cases or within each case? – Adám – 8 years ago

@Adám It needs only be consistent within the same case. If it's different on subsequent runs, or different inputs, that's fine. I just don't want a mishmash of characters so you can't understand the graph. – AdmBorkBork – 8 years ago

a quick brown fox jumpest over a lazy dog has fewer extra letters. – mbomb007 – 8 years ago

Answers

7

R, 239 230 bytes

K=table(el(strsplit(gsub(" ","",scan(,"")),"")));m=matrix(" ",L<-sum(K|1)+1,M<-(M=max(K))+-M%%5+1);m[2:L,M]=names(K);m[1,M-g]=paste0(g<-c(1,seq(5,M,5)),"-");m[1,]=format(m[1,],j="r");for(X in 2:L)m[X,M-1:K[X-1]]=0;write(m,1,L,,"")

Try it online!

table does the heavy lifting here, uniquifying the characters, sorting them, and returning their counts.

Everything else is just ensuring the offsets are right for printing, which is the "real" work of an ascii-art challenge.
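
For readers who don't use R, a loose Python analogue of what `table` contributes here (unique letters, sorted, with counts). The helper name `letter_table` is hypothetical and this is only a sketch of the idea, not a translation of the answer:

```python
from collections import Counter


def letter_table(sentence):
    """Sorted (letter, count) pairs, ignoring spaces."""
    return sorted(Counter(sentence.replace(" ", "")).items())


print(letter_table("test example"))
```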

Thanks to @dylnan for pointing out a bug.

Thanks to @rturnbull for the scan approach, dropping 2 bytes.

Giuseppe

Posted 8 years ago

Reputation: 21 077

237 bytes – rturnbull – 8 years ago

@rturnbull I managed to knock off a few more bytes after that, thanks! – Giuseppe – 8 years ago

7

Stax, 37 bytes

ü╣úwóΓµ┐Wh0íJ▌Ñìs┤►ï╖öz<à↔/Ü@τ|:╢Ω$‼φ

Run and debug it

recursive

Posted 8 years ago

Reputation: 8 616

6

gnu sed -r, 516 490 278 249 + 1 bytes

s/$/:ABCDEFGHIJKLMNOPQRSTUVWXYZ /
:a
s/(.)(:.*\1)/\2\1/I
ta
s/[A-Z]+/ /g
h
z
:b
s/ \w/ /g
G
s/:/&I/g
/:II{5}+ *$/M!bb
s/[a-z]/X/g
G
s/:(I{5}+|I)\b/0\1-/g
s/:I*/  /g
s/ (\w)\1*/\1/g
s/$/; 10123456789I0/
:c
s/(.)I(.*\1(I?.))|;.*/\3\2/
/\nI/s/^/ /Mg
tc

Try it online!


I am sure this can be improved, but for now, this should be good considering it is made in sed, where you don't have native arithmetic or sorting. So I lied, this wasn't good enough, so I improved (rewrote) it by another 212 bytes, with a tip regarding the sorting algorithm from Cows quack, which gave me an idea to make the unary to decimal conversion shorter too.
Description of inner workings:

s/$/:ABCDEFGHIJKLMNOPQRSTUVWXYZ /
:a
s/(.)(:.*\1)/\2\1/I
ta
s/[A-Z]+/ /g
h
z

This sorts the input and separates the groups with spaces. This works by first appending an uppercase alphabet plus space separated by a colon to the end. Then it moves each character in front of the colon to a matching character behind the colon using a case-insensitive substitution in a loop. The uppercase letters are then replaced by spaces and the string is copied to the holding space.
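
The sorting step described above can be imitated in Python to watch it work. This is a sketch under the assumption that Python's `re` behaves like GNU sed for this pattern (greedy `.*`, case-insensitive backreference); the function name `sed_style_sort` is made up for illustration.

```python
import re
import string


def sed_style_sort(line):
    """Mimic the sed loop: move each character behind its match after the colon."""
    work = line + ":" + string.ascii_uppercase + " "
    while True:
        # (.) is the character right before the colon; .* is greedy, so the
        # character lands after the last case-insensitive match in the suffix.
        work, moved = re.subn(r"(.)(:.*\1)", r"\2\1", work, count=1, flags=re.I)
        if not moved:
            break
    # Each run of uppercase letters collapses to a single separating space.
    return work, re.sub(r"[A-Z]+", " ", work)


raw, grouped = sed_style_sort("a bb")
print(repr(raw))
print(repr(grouped))
```

The loop stops once nothing is left in front of the colon, which corresponds to `ta` falling through in the sed script.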

:b
s/ \w/ /g
G
s/:/&I/g
/:II{5}+ *$/M!bb

This loop works by reducing each character group size by one, appending the sorted original line and incrementing unary counters after the colon that remained from the sorting. It loops until an empty line with a number of 5*n + 1 is reached (since the last line ultimately results in whitespace). The pattern space looks something like this after the loop:

:IIIIII           
:IIIII           
:IIII           
:III  e         
:II  ee     t    
:I a eee l m p s tt x   

Then the formatting follows:

s/[a-z]/X/g            # makes bars consistent
G                      # appends line that becomes x-axis
s/:(I{5}+|I)\b/0\1-/g  # moves zero in front of line 1 or 5-divisible
                       # lines for the decimal conversion and adds -
s/:I*/  /g             # removes numbers from other lines
s/ (\w)\1*/\1/g        # collapses groups of at least 1 into 1
                       # character, deleting the space before it
                       # so that only size-0-groups have spaces

And finally, the unary to decimal converter remains:

s/$/; 10123456789I0/
:c
s/(.)I(.*\1(I?.))|;.*/\3\2/
/\nI/s/^/ /Mg
tc

It basically appends a string that encodes the conversion rules. You can interpret it as :space:->1 and 0->1->2->3->4->5->6->7->8->9->I0. The substitution expression s/(.)I(.*\1(I?.))|;.*/\3\2/ works similarly to the sorting one, replacing the characters in front of I's [ (.)I ] by the character that is next to the one from the front of the I in the conversion string [ (.*\1(I?.)) ] and if there is no I left, it removes the appended string [ |;.* ]. The substitution [ /\nI/s/^/ /Mg ] adds padding if needed.

Thanks to Cows quack for reducing the size by 26 bytes and for the shorter sorting algorithm.

Sad Sed

Posted 8 years ago

Reputation: 61

Welcome to PPCG, and nice first answer! – user41805 – 8 years ago

You can use \w (matches word characters) in a number of places to save some bytes. Also :b ... tb can simply become s/\B\w/X/g. You can remove the line that follows it, s/:/:,/g, by modifying the earlier substitutions. You can look at https://goo.gl/JvD7Rs (shortened TIO link to sed program) to see what I mean.

– user41805 – 8 years ago

You can improve on the sorting algorithm, hint: try appending zyx...cba to the input. – user41805 – 8 years ago

Brilliant unary to decimal converter! Yours is at least 30 bytes shorter than the one in Tips for golfing in sed

– user41805 – 8 years ago

5

Dyalog APL, 109 97 96 95 93 88 bytes

{⊖(⍉r),⍨⍕↑(⊂'')@{1@0~⍵∊1,5×⍵}⍳≢⍉↑r←↑r,⍨⊂' -','   - '⍴⍨5×⌈5÷⍨≢1↓⍉↑r←↓{⍺,∊⍕¨0×⍵}⌸⍵[⍋⍵]∩⎕A}

Try it online!

Requires ⎕IO←0

Way too many bytes saved thanks to Adám and Cows quack!

dzaima

Posted 8 years ago

Reputation: 19 048

For the last bit, you can try ⍵[⍋⍵]~' ' (sorts and removes spaces before passing through ) – user41805 – 8 years ago

'X'/⍨≢∊⍕¨× – Adám – 8 years ago

and ⍵>0 → ×⍵ – user41805 – 8 years ago

Your TIO link has an unnecessary Header. – Adám – 8 years ago

2⌷⍴≢⍉ twice – Adám – 8 years ago

~' '∩⎕A and take input in uppercase. – Adám – 8 years ago

and ¯1+≢≢1↓ – Adám – 8 years ago

¯1+⍳ and ×⍵0×⍵ with ⎕IO←0 – Adám – 8 years ago

{(1=⍵)∨(×⍵)∧0=5|⍵:⍵⋄''}¨(⊂'')@{1@0~⍵∊1,5×⍵} – Adám – 8 years ago

5

05AB1E, 58 47 bytes

áê©S¢Z5‰`Ā+L5*1¸ì'-«ð5×ý#À¦Áí.Bís'X×ζ‚ζJR»,®3ú,

Try it online!

-11 bytes thanks to @Emigna

Magic Octopus Urn

Posted 8 years ago

Reputation: 19 422

Maybe this could help? Don't have time to tie them together but maybe they can give some inspiration.

– Emigna – 8 years ago

@Emigna I'll have a look, definitely different than mine :). – Magic Octopus Urn – 8 years ago

@Emigna 57 bytes after I stitched it... given I didn't try too hard to optimize. Try it online!

– Magic Octopus Urn – 8 years ago

47 bytes with some restructuring and optimization. – Emigna – 8 years ago

That could be fixed at the cost of 1 byte

– Emigna – 8 years ago

3

Python 2, 192 bytes

s=input()
d={c:s.count(c)for c in s if' '<c}
h=-max(d.values())/5*-5
for y in range(h,-1,-1):print('%d-'%y*(y%5==2>>y)).rjust(len(`-h`))+''.join(('X '[y>v],k)[y<1]for k,v in sorted(d.items()))

Try it online!

Explanation

Line 2 computes the histogram values in a fairly straightforward way, discarding ' '.

Line 3 uses the trick of computing ceil(x/5) as -(-x/5): we round the maximal frequency up to the next multiple of 5 using the formula -x/5*-5. This is h.

Line 4 is a loop counting from h down to 0 inclusive, printing each row:

  • If y%5==2>>y we print a label. This is when y ∈ {1, 5, 10, 15, 20, …}

    (This formula could maybe be shorter. We just need something that's 1 or True for {1, 5, 10, …}, and 0 or False or even a negative integer for all other values of y.)

  • We right-justify the label (or empty space) into len(`-h`) spaces: this is a neat one-byte saving over len(`h`)+1!

  • Then, we print either X's and spaces for this row (if y ≥ 1) or the letters (if y = 0), running through key-value pairs in d in ascending order.
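
The three tricks in this explanation can be checked in isolation. The sketch below is Python 3 (so `//` replaces Python 2's integer `/`, and `str` replaces backticks); the function names are made up for illustration.

```python
def round_up_to_five(x):
    # Floor division of the negated value rounds toward minus infinity,
    # so multiplying by -5 afterwards gives the next multiple of five.
    return -x // 5 * -5


def is_tick(y):
    # 2 >> y is 2, 1, 0, 0, ... for y = 0, 1, 2, 3, ...
    # so this is true for y = 1 and for positive multiples of five.
    return y % 5 == 2 >> y


def label_width(h):
    # The minus sign of -h takes the place reserved for the trailing dash.
    return len(str(-h))


print([round_up_to_five(x) for x in (1, 5, 6, 12)])
print([y for y in range(21) if is_tick(y)])
print(label_width(10))
```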

Lynn

Posted 8 years ago

Reputation: 55 648

Nice tick creation with '%d-'%y*(y%5==2>>y). Do you mind if I use that in my answer? – dylnan – 8 years ago

-~-(y%5*~-y) works too but it's one byte longer unfortunately. – dylnan – 8 years ago

2

Charcoal, 62 bytes

≔E²⁷⟦⟧ηFθ⊞§η⌕βιι≔⊟ηθ≦LηP⭆β⎇§ηκιω↑↑ΦηιF÷⁺⁹⌈η⁵«≔∨×⁵ι¹ιJ±¹±ι←⮌⁺ι-

Try it online! Link is to verbose version of code. Explanation:

≔E²⁷⟦⟧η

Create a list of 27 lists.

Fθ⊞§η⌕βιι

Push each input character to the list corresponding to its position in the lowercase alphabet. Non-lowercase characters get pushed to the 27th list.

≔⊟ηθ

Discard the 27th element of the list.

≦Lη

Take the lengths of all the elements of the list.

P⭆β⎇§ηκιω

Print the lowercase letters corresponding to non-zero list elements.

↑↑Φηι

Print the non-zero list elements upwards. Since this is an array of integers, each integer prints as a (now vertical) line, each in a separate column.

F÷⁺⁹⌈η⁵«

Calculate the number of tick marks on the Y-axis and loop over them.

≔∨×⁵ι¹ι

Calculate the position of the next tick mark.

J±¹±ι

Jump to the next tickmark.

←⮌⁺ι-

Print the tickmark reversed and back-to-front, effectively right-aligning it.
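
A rough Python analogue of the counting and tick steps above, as a sketch only. Here `str.find` returning -1 together with Python's negative indexing sends non-lowercase characters to the 27th list; whether Charcoal gets there by the same mechanism is an assumption on my part. The function names are hypothetical.

```python
import string


def bucket_counts(sentence):
    buckets = [[] for _ in range(27)]
    for ch in sentence:
        # find() gives -1 for anything outside a-z, and index -1 is the
        # last (27th) list, so spaces collect there.
        buckets[string.ascii_lowercase.find(ch)].append(ch)
    buckets.pop()  # discard the 27th list
    return [len(b) for b in buckets]


def tick_positions(tallest):
    # (9 + tallest) // 5 ticks in total; tick 0 sits at row 1, tick i at row 5 * i.
    return [5 * i or 1 for i in range((9 + tallest) // 5)]


print(bucket_counts("a bb")[:3])
print(tick_positions(12))
```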

Neil

Posted 8 years ago

Reputation: 95 035

2

Jelly, 48 bytes

What a mine-field to traverse!

J’⁶D;”-Ɗ%5^ỊƲ?€Uz⁶ZU
ḟ⁶ṢµĠ¬;⁶$L%5Ɗ¿€;"@Qz⁶Ç;"$ṚY

A full-program printing the result (as a monadic link it would return a list containing characters and integers from [0,9])

Try it online! Or see the test-suite

How?

J’⁶D;”-Ɗ%5^ỊƲ?€Uz⁶ZU - Link 1, get y-axis: list of columns (including x-axis & top-spaces)
J                    - range of length  [1,2,3,4,5,6,...,height+1] (+1 for x-axis)
 ’                   - decrement        [0,1,2,3,4,5,...] (line it up with the content)
             ?€      - if for €ach...
            Ʋ        - ...condition: last four links as a monad:
        %5           -   modulo by five
           Ị         -   insignificant? (1 for 0 and 1, else 0)
          ^          -   XOR (0 for 1 or multiples of 5 greater than 0, else non-zero)
  ⁶                  - ...then: literal space character
       Ɗ             - ...else: last three links as a monad:
   D                 -   decimal list of the number, e.g. 10 -> [1,0]
     ”-              -   literal '-' character
    ;                -   concatenate, e.g. [1,0,'-']
               U     - upend (reverse each)
                z⁶   - transpose with a filler of space characters
                  Z  - transpose
                   U - upend (i.e. Uz⁶ZU pads the left with spaces as needed)

ḟ⁶ṢµĠ¬;⁶$L%5Ɗ¿€;"@Qz⁶Ç;"$ṚY - Main link: list of characters
ḟ⁶                          - filter out space characters
  Ṣ                         - sort
   µ                        - start a new monadic chain, call that S
    Ġ                       - group indices of S by their values
     ¬                      - logical NOT (vectorises) (getting 0 for the X "characters")
             ¿€             - while for €ach...
            Ɗ               - ...condition: last three links as a monad:
         L                  -   length
          %5                -   modulo by five
        $                   - ...do: last two links as a monad:
      ;⁶                    -   concatenate a space character
                  Q         - deduplicate S (get the x-axis)
               ;"@          - zip with (") concatenation (;) with swapped arguments (@)
                   z⁶       - transpose with a filler of space characters
                        $   - last two links as a monad:
                     Ç      -   call last link (1) as a monad (get y-axis)
                      ;"    -   zip with concatenation (complete the layout)
                         Ṛ  - reverse (otherwise it'll be upside-down)
                          Y - join with newlines
                            - implicit print

Jonathan Allan

Posted 8 years ago

Reputation: 67 804

2

Java (JDK 10), 296 bytes

s->{int d[]=new int[26],m=0;char a;for(int c:s.getBytes())m=c>32&&++d[c-=65]>m?(d[c]+4)/5*5:m;String r=m+"-",z=r.replaceAll("."," ");for(;m>0;r+="\n"+(--m%5<1|m==1&&m>0?z.format("%"+~-z.length()+"s-",m):z))for(a=0;a<26;a++)r+=d[a]>0?m>d[a]?" ":"x":"";for(a=64;a++<90;)r+=d[a-65]>0?a:"";return r;}

Try it online!

Credits

Olivier Grégoire

Posted 8 years ago

Reputation: 10 647

@aoemica Correct. I fixed it. – Olivier Grégoire – 8 years ago

It's not much, but you can save 2 bytes. --m%5==0 can be --m%5<1, because you also have the &m>0 check. And m<=d[a]?"x":" " can be m>d[a]?" ":"x". – Kevin Cruijssen – 8 years ago

@KevinCruijssen 2 bytes are 2 bytes! I don't think there is much to golf anymore, except for a different algorithm. – Olivier Grégoire – 8 years ago

1 more byte by changing (--m%5<1|m==1)&m>0 to --m%5<1|m==1&&m>0 – Kevin Cruijssen – 8 years ago

2

Ruby, 250 248 234 188 173 157 153 bytes

->s{a=s.scan(/\w/).sort|[]
m=-(c=a.map{|l|s.count l}).max/5*-5
m.downto(1).map{|i|(i%5<1||i<2?"#{i}-":'').rjust(m)+c.map{|l|l<i ?' ':?X}*''}<<' '*m+a*''}

Try it online!

Thanks to:

  • dylnan for -16 bytes with less strict padding
  • Lynn for -2 bytes by rounding up with -x/5*-5
  • Kirill L. for -2 bytes by getting unique array elements with |[]

Nnnes

Posted 8 years ago

Reputation: 395

2

APL (Dyalog Classic), 56 bytes

⊖h,⍨{∊⍺'-'/⍨~×5|⍵-⍵<2}⌸⍕⍪⍳≢h←(⊢↑⍨≢+5|1-≢)⍉(⊢,⌸⊣\)∊⍞∘∩¨⎕a

Try it online!

ngn

Posted 8 years ago

Reputation: 11 449

1

Pyth, 65 bytes

J.tm+ed*hd\Xr8S-Qd)=+J*]d%_tlJ5_.e+?q<k2%k5.F"{:{}d}-",klQ*dhlQbJ

Try it here

Explanation

J.tm+ed*hd\Xr8S-Qd)=+J*]d%_tlJ5_.e+?q<k2%k5.F"{:{}d}-",klQ*dhlQbJ
J.tm+ed*hd\Xr8S-Qd)
     Get the bars.
                   =+J*]d%_tlJ5
     Round up the height to the next number that's 1 mod 5.
                               _.e+?q<k2%k5.F"{:{}d}-",klQ*dhlQbJ
     Stick the axis labels on.

user48543

Posted 8 years ago

Reputation:

1

JavaScript (Node.js), 262 256 bytes

Thanks to @Shaggy for reducing by 2 bytes

a=>[...a].map(x=>x>" "&&(d=c[x]=(c[x]||x)+"X")[m]?m=d.length-1:0,c={},m=i=0)&&Object.keys(c).sort().map(x=>[...c[x].padEnd(m)].map((l,j)=>A[m-j-1]+=l),A=[...Array(m+=6-m%5)].map(x=>(++i>=m||((D=m-i)%5&&m-i^1)?"":D+"-").padStart((m+" ").length)))&&A.join`
`

Try it online!

DanielIndie

Posted 8 years ago

Reputation: 1 220

Couple of quick savings I can spot on my phone: 1. Take input as an array of individual characters, 2. Replace x!=" " with x>" ". – Shaggy – 8 years ago

3. Replace m=0 with i=m=0 and map((x,i)=> with map(x=>. – Shaggy – 8 years ago

1

Python 2, 249 224 219 215 205 197 187 188 182 176 bytes

def f(s):S=sorted(set(s)^{' '});C=map(s.count,S);P=max(C)+4;return zip(*(zip(*[('%d-'%y*(y%5==2>>y)).rjust(P)for y in range(P,0,-1)])+[(n*'#').rjust(P)for n in C]))+[[' ']*P+S]

Try it online!

Returns a list of lists of characters representing lines.

  • Saved some bytes by including a lot of extra whitespace.
  • Had an unnecessary map(list,yticks) in there.
  • Changed space padding to save some bytes.
  • I thought I was sorting but I was not: +2 bytes. But I saved one independently at the same time. y==1 replaced by y<2.
  • -6 bytes thanks to Lynn by using '%d-'%y*(y%5==2>>y) instead of (`y`+'-')*(not y%5or y<2).

Slightly ungolfed:

def f(s):
	S=sorted(set(s)^{' '})  # sorted list of unique letters (without ' ')
	C=map(s.count,S)        # count of each unique letter in the input
	P=max(C)+4              # used for padding and getting highest y tick
	B=[(n*'#').rjust(P)for n in C]     # create bars
	yticks = [('%d-'%y*(y%5==2>>y)).rjust(P)for y in range(P,0,-1)]  # create y ticks at 1 and multiples of 5
	yticks = zip(*yticks)                      # need y ticks as columns before combining with bars
	return zip(*(yticks+B))+[[' ']*P+S]        # zip ticks+bars then add row of sorted unique letters.
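
The core move in the ungolfed version is building everything as columns and transposing with zip(*...). A minimal Python 3 sketch of just that step (the helper name `columns_to_rows` is made up, and the strings are joined for readability rather than left as tuples):

```python
def columns_to_rows(columns):
    # Each string in `columns` reads top to bottom; zip(*...) regroups the
    # characters so that each result reads left to right.
    return ["".join(row) for row in zip(*columns)]


counts = [1, 2]
height = 3
bars = [("#" * n).rjust(height) for n in counts]  # ['  #', ' ##']
print(columns_to_rows(bars))
```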

dylnan

Posted 8 years ago

Reputation: 4 993

1

C# (.NET Core), 344 340 338 + 18 bytes

Includes 18 bytes for using System.Linq;

Saved 6 bytes thanks to @KevinCruijssen.

n=>{var l=n.Where(c=>c!=32).GroupBy(c=>c).OrderBy(c=>c.Key).ToDictionary(k=>k.Key,c=>c.Count());int h=(l.Values.Max()/5+1)*5,o=(h+"").Length+1,m=l.Keys.Count+o,t=h+1,i=0,j;var a=new string[t];for(string p,q;i<t;a[i++]=q)for(q=(p=i>0&i%5<1|i==1?i+"-":"").PadLeft(j=o);j<m;){var c=l.ElementAt(j++-o).Key;q+=i<1?c:l[c]>=i?'X':' ';}return a;}

Try it online!

Ian H.

Posted 8 years ago

Reputation: 2 431

You have a space at j< m; that can be removed. And int i=0,j can be placed as ,i=0,j after the other ints for -4 bytes in total. You'll have to include the 18 bytes for the using System.Linq; however. – Kevin Cruijssen – 8 years ago

@KevinCruijssen Thanks, I missed these. And I added the 18 bytes. – Ian H. – 8 years ago

+1 from me. Oh, and you can save 2 more bytes by changing for(;i<t;){string p=i>0&i%5<1|i==1?i+"-":"",q=p.PadLeft(o);for(j=o;j<m;){...}a[i++]=q;} to for(string p,q;i<t;)for(p=i>0&i%5<1|i==1?i+"-":"",q=p.PadLeft(j=o);j<m;a[i++]=q){...}. Try it online.

– Kevin Cruijssen – 8 years ago

@KevinCruijssen That's really clever, thanks! – Ian H. – 8 years ago

1

Bash + coreutils, 332 324 323 318 312 302 298 296 293 291 bytes

c()(cut -d\  -f$@)
p=printf
cd `mktemp -d`
grep -o [^\ ]<<<$@|sort|uniq -c|c 7->a
sort -k2<a>b
r=$[`c 1 <a|sort -n|tail -1`+5]
s=${#r}
t()($p \ ;((i++<s))&&t;i=)
for((;--r;));{
((r%5&&r>1))&&t||$p %${s}s- $r;IFS='
'
for l in `<b`;{ ((r<=`c 1 <<<$l`))&&$p X||$p \ ;}
echo
}
t
c 2 <b|tr -d \\n

Try it online!

Annotated:

c()(cut -d\  -f$@)
p=printf              # saving a few bytes

cd `mktemp -d`        # for temp files

grep -o [^\ ]<<<$@    # grabs all non-space characters
    |sort|uniq -c     # get character frequency
    |c 7->a           # slightly hacky way of stripping leading spaces;
                      #     uniq -c adds 6 spaces in front of each line

sort -k2<a>b          # store frequencies sorted alphabetically in b

r=$[`                 # r = highest frequency +5:
    c 1 <a            #     get frequencies
    |sort -n|tail -1  #     get maximum frequency
    `+5]              #     +4 so at least one multiple of 5 is
                      #     labeled, +1 because r gets pre-decremented

s=${#r}                    # s = length of r as a string
t()($p \ ;((i++<s))&&t;i=) # pad line with s+1 spaces

for((;--r;));{         # while (--r != 0)
    ((r%5&&r>1))&&     # if r isn't 1 or a multiple of 5
        t||            #     then indent line 
        $p %${s}s- $r; # otherwise print right-aligned "${r}-"
        IFS='
'                      # newline field separator
    for l in `<b`;{          # for all letters and their frequencies:
        ((r<=`c 1 <<<$l`))&& #     if current height <= frequency
            $p X||           #         then print an X
            $p \ ;}          #     otherwise print a space
    echo
}
t # indent x-axis labels
c 2 <b|tr -d \\n # print alphabetically sorted characters

Thanks to @IanM_Matrix1 for saving 3 bytes.

user9549915

Posted 8 years ago

Reputation: 401

cat b could be <b saving 3 characters – IanM_Matrix1 – 8 years ago

0

C, 201 bytes

char c[256],*p,d;main(int a,char **b){for(p=b[1];*p;p++)++c[*p|32]>d&*p>64?d++:0;for(a=(d+4)/5*5;a+1;a--){printf(!a||a%5&&a!=1?"    ":"%3i-",a);for(d=96;++d>0;c[d]?putchar(a?32|c[d]>=a:d):0);puts(p);}}

Input is taken from the command line (first argument). Uses exclamation marks instead of X's to further reduce code size. Counter on the left is always three characters long.

Tested with GCC and clang.

Simon

Posted 8 years ago

Reputation: 141

for(p=b[1];*p;p++) can most likely be for(p=b[1]-1;*++p;), main(int a,char **b) could probably be golfed to m(a,b)char**b;. – Jonathan Frech – 8 years ago

Since a!=1 will be boolean, a%5&&a!=1? should be equivalent to a%5&a!=1? or a%5&&~-a. – Jonathan Frech – 8 years ago

0

JavaScript (ES8), 200 bytes

Takes input as an array of characters. Returns a string.

s=>(s.sort().join``.replace(/(\w)\1*/g,s=>a.push(s[0]+'X'.repeat(l=s.length,h=h<l?l:h)),h=a=[]),g=y=>y--?(y<2^y%5?'':y+'-').padStart(`${h}_`.length)+a.map(r=>r[y]||' ').join``+`
`+g(y):'')(h+=5-~-h%5)

Try it online!

Commented

s => (                    // s[] = input array of characters (e.g. ['a','b','a','c','a'])
  s.sort()                // sort it in lexicographical order (--> ['a','a','a','b','c'])
  .join``                 // join it (--> 'aaabc')
  .replace(/(\w)\1*/g,    // for each run s of consecutive identical letters (e.g. 'aaa'):
    s => a.push(          //   push in a[]:
      s[0] +              //     the letter, which will appear on the X-axis
      'X'.repeat(         //     followed by 'X' repeated L times
        L = s.length,     //     where L is the length of the run (--> 'aXXX')
        h = h < L ? L : h //     keep track of h = highest value of L
    )),                   //   initialization:
    h = a = []            //     h = a = empty array (h is coerced to 0)
  ),                      // end of replace() (--> a = ['aXXX','bX','cX'] and h = 3)
  g = y =>                // g = recursive function taking y
    y-- ?                 //   decrement y; if there's still a row to process:
      (                   //     build the label for the Y-axis:
        y < 2 ^ y % 5 ?   //       if y != 1 and (y mod 5 != 0 or y = 0):
          ''              //         use an empty label
        :                 //       else:
          y + '-'         //         use a mark
      ).padStart(         //     pad the label with leading spaces,
        `${h}_`.length    //     using the length of the highest possible value of y
      ) +                 //     (padStart() is defined in ECMAScript 2017, aka ES8)
      a.map(r => r[y]     //     append the row,
                 || ' ')  //     padded with spaces when needed
      .join`` + `\n` +    //     join it and append a linefeed
      g(y)                //     append the result of a recursive call
    :                     //   else:
      ''                  //     stop recursion
)(h += 5 - ~-h % 5)       // call g() with h adjusted to the next multiple of 5 + 1
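
The data flow in the commented version can be followed in Python as well. This is a sketch, not a port: the names `run_columns`, `start_row` and `row` are hypothetical, and `(h - 1) % 5` is my spelling of the JavaScript expression `~-h % 5`.

```python
import re


def run_columns(chars):
    # Sort, join, then turn each run of identical letters into one column
    # string: the letter followed by one 'X' per occurrence.
    joined = "".join(sorted(chars))
    return [m.group(1) + "X" * len(m.group(0))
            for m in re.finditer(r"(\w)\1*", joined)]


def start_row(h):
    # Next multiple of five at or above h, plus one for the X-axis row.
    return h + 5 - (h - 1) % 5


def row(columns, y):
    # Row y of the chart: index 0 is the letter row, higher indices are bars.
    return "".join(c[y] if y < len(c) else " " for c in columns)


cols = run_columns(list("abaca"))
print(cols)
print([row(cols, y) for y in range(3, -1, -1)])
```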

Arnauld

Posted 8 years ago

Reputation: 111 334

0

Excel VBA, 316 bytes

An Anonymous VBE immediate window function that takes input from cell [A1] and outputs to the VBE immediate window.

For i=1To 26:Cells(2,i)=Len(Replace([Upper(A1)],Chr(i+64),11))-[Len(A1)]:Next:m=-5*Int(-[Max(2:2)]/5):l=Len(m)+1:For i=-m To-1:?Right(Space(l) &IIf(i=-1Xor i Mod 5,"",-i &"-"),l);:For j=1To 26:?IIf(Cells(2,j),IIf(Cells(2, j) >= -i, "X", " "),"");:Next:?:Next:?Spc(l);:For i=1To 26:?IIf(Cells(2,i),Chr(i+96),"");:Next

Ungolfed Version

Public Sub bar_graph()
    For i = 1 To 26
        ''  gather the count of the letter into cells
        Cells(2, i) = Len(Replace([Upper(A1)], Chr(i + 64), 11)) - [Len(A1)]
    Next
    m = -5 * Int(-[Max(2:2)] / 5)   ''  get max bar height
    l = Len(m) + 1                  ''  length of `m` + 1
    For i = -m To -1
        ''  print either a label or free space (y-axis)
        Debug.Print Right(Space(l) & IIf((i = -1) Xor i Mod 5, "", -i & "-"), l);
        For j = 1 To 26
            ''  print 'X' or ' ' IFF the count of the letter is positive
            If Cells(2, j) Then Debug.Print IIf(Cells(2, j) >= -i, "X", " ");
        Next
        Debug.Print                 ''  print a newline
    Next
    Debug.Print Spc(l);             ''  print spaces
    For i = 1 To 26
        ''  print the letters that were used (x-axis)
        Debug.Print IIf(Cells(2, i), Chr(i + 96), "");
    Next
End Sub

Taylor Scott

Posted 8 years ago

Reputation: 6 709

0

Perl 5 -n, 198 168 bytes

s/[a-z]/$\<++${$&}?$\=${$&}:0/eg;$\++while$\%5;$;=1+length$\++;printf"%$;s".'%s'x26 .$/,$\%5&&$\-1?"":"$\-",map$$_>=$\?X:$$_&&$",a..z while--$\;say$"x$;,map$$_&&$_,a..z

Try it online!

Xcali

Posted 8 years ago

Reputation: 7 671

0

Python 3, 177 bytes

lambda s:[[list(("%d-"%i*(i%5==2>>i)).rjust(len(q)))+["* "[s.count(c)<i]for c in q]for i in range(max(map(s.count,q))+4,0,-1)]+[[" "]*len(q)+q]for q in[sorted(set(s)-{' '})]][0]

Try it online!

This may not be the most byte-efficient approach in Python, but I really wanted to solve this with a "true one-liner" lambda.

Outputs a list of character lists. Abuses multiple leading newlines and spaces just like everybody else. It may actually be further reduced to 174 bytes if it is acceptable to wrap the result in another list, so that we could transfer the final [0] indexing to the footer.

Kirill L.

Posted 8 years ago

Reputation: 6 693