Character counts in source code


Write a program that outputs a list of the number of occurrences of each unique character in its source code.

For example, this hypothetical program {Source_Print_1}; should produce this output:

1 1
; 1
P 1
S 1
_ 2
c 1
e 1
i 1
n 1
o 1
p 1
r 2
t 1
u 1
{ 1
} 1

The formatting should match this example. No extraneous whitespace is allowed, except an optional final newline.

Your program may not read its own source code from the source file.

The characters must be listed in one of two orders: either the order of character values in the character encoding used by your language (probably ASCII), or the order in which the characters appear in your source.

This question was inspired by a comment by Jan Dvorak.


Posted 2015-05-29T17:03:49.923



A zero-length program would work in quite a few languages. Does this count as a standard loophole? – Digital Trauma – 2015-05-29T20:38:10.973

Let's go with... yes. – Sparr – 2015-05-29T21:09:39.857

@DigitalTrauma: Added to the list. – Dennis – 2015-05-30T01:09:41.530

Can the code contain newlines? – jimmy23013 – 2015-05-30T03:00:45.207

@user23013 Good question. I did not consider newlines. I guess if you include them, I'd accept an answer that prints them out literally, so there would be one double-newline in the file somewhere. – Sparr – 2015-05-31T23:37:20.740



CJam, 14 bytes


Try it here.

The output lists the characters in the order they first appear:

{ 2
S 2
2 2
N 2
` 2
/ 2
} 2

It simply appends <SP>2<NL> to each character in {S2N`/}.
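The same effect can be mimicked in Python as a rough sketch. The seven-character string is taken from the explanation above; the variable names are illustrative. The hard-coded 2 is correct only because, per the listed output, the program contains each of these characters exactly twice.

```python
chars = "{S2N`/}"  # the distinct characters, in order of first appearance

# Append " 2\n" (space, count, newline) to every character.
output = "".join(ch + " 2\n" for ch in chars)
print(output, end="")
```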




///, 12 bytes

4 4


A big thank you goes to @user23013, who suggested this improvement over my CJam code, outgolfing his own highest-scoring answer in the process.

The characters are sorted by appearance. This code works in any language that just prints its own source code under the given circumstances (PHP, ASP, etc.).

CJam, 20 bytes

''S5N'5S5N'NS5N'SS5N

This approach doesn't use any built-in character counting.

Try it online in the CJam interpreter.

How it works

''S5N e# Push a single quote, a space, the integer 5 and a linefeed.
'5S5N e# Push the character 5, a space, the integer 5 and a linefeed.
'NS5N e# Push the character N, a space, the integer 5 and a linefeed.
'SS5N e# Push the character S, a space, the integer 5 and a linefeed.
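A quick way to check that every hard-coded 5 is right is to count the characters directly. In this Python sketch the program text is assembled from the four annotated pieces above; that they are simply concatenated is an assumption, though it does give the stated 20 bytes.

```python
from collections import Counter

# The four annotated pieces, concatenated (assumed to form the whole program).
pieces = ["''S5N", "'5S5N", "'NS5N", "'SS5N"]
program = "".join(pieces)

counts = Counter(program)
# Sorted by character value, which is also the order the four pieces emit them.
for ch in sorted(counts):
    print(ch, counts[ch])
```

Each of the four distinct characters turns out to occur five times, so each piece can safely push the literal 5.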



+1 for not using standard quine techniques. – Martin Ender – 2015-05-29T18:41:30.803

I really really hope this one remains tied for the lead. I'll happily give it the checkmark over its quine counterpart. – Sparr – 2015-05-30T01:45:01.103

Now that newlines are allowed, I think this answer would be better merged into yours. – jimmy23013 – 2015-06-01T00:03:14.157

@user23013: That's even shorter than your CJam answer. Thanks! – Dennis – 2015-06-01T00:25:45.490


CJam, 20 bytes

{`"_~"+$e`{)S@N}%}_~

How it works

We start with one of the standard quines in CJam


which pushes the block onto the stack, copies it, and runs the copy, so that the source code itself ends up being printed.

Then we add the logic to compute the character count from the source code:

{`"_~"+                         e# At this point, we have the full source code with us
       $e`                      e# Sort to get similar characters together and run RLE to
                                e# get count of each character as [count char] array
          {    }%               e# Run each array element through this loop
           )S@N                 e# Pop the character, put a space, rotate the count after
                                e# space and then finally put a newline after the trio
                 }_~            e# Second half of the standard quine explained above
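The sort-then-run-length-encode step has a direct analogue in Python. This is a sketch of the counting logic only, not of the quine; `sorted_rle_counts` is an illustrative name, and the source is passed in as an ordinary string.

```python
from itertools import groupby


def sorted_rle_counts(source: str) -> str:
    """Sort the characters, run-length encode them, and format each run."""
    lines = []
    # After sorting, equal characters are adjacent, so each group is one run.
    for ch, run in groupby(sorted(source)):
        # Character, space, count, newline: the same layout as )S@N.
        lines.append(f"{ch} {len(list(run))}\n")
    return "".join(lines)


if __name__ == "__main__":
    print(sorted_rle_counts("hello"), end="")
```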

Try it online here




Python 3.5.0b1, 73 bytes (was 107)

s="t='s=%r;exec(s)'%s;[print(c,t.count(c))for c in sorted({*t})]";exec(s)

Rather than the usual string replacement quine, which requires writing everything twice, here's an exec quine.
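To see why this works, it helps to check that the string `t` built inside `s` really equals the program text. The sketch below does that without calling `exec`; the name `source` is illustrative and holds the one-line program written out as a Python literal.

```python
s = "t='s=%r;exec(s)'%s;[print(c,t.count(c))for c in sorted({*t})]"

# What the exec'd code computes: %r inserts repr(s), quotes included.
# repr picks double quotes here because s contains single quotes only.
t = 's=%r;exec(s)' % s

# The program text itself, written out as a literal for comparison.
source = 's="t=\'s=%r;exec(s)\'%s;[print(c,t.count(c))for c in sorted({*t})]";exec(s)'

print(t == source)   # True
print(len(source))   # 73
```

Because `t` is rebuilt from `s` at run time, nothing has to be written out twice by hand.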




Mathematica, 101 bytes

Apply[Print[#1, " ", #2] &, Tally[Characters[StringJoin[ToString[#0, InputForm], "[];"]]], {1}] & [];

Unfortunately, I can't use any of the normal golfing tricks like removing whitespace, <> for StringJoin, # instead of #1, @ for prefix function calls or @@@ instead of Apply[...,{1}], because ToString[...,InputForm] thinks it has to pretty print everything...

This prints the characters in the order they first appear in the code. If I can assume that this isn't run in a REPL environment (which is rather unusual for Mathematica) I can save two bytes by omitting the two ;.

Martin Ender


InputForm is annoying... OutputForm is better but it doesn't quote strings. – LegionMammal978 – 2015-05-29T21:34:37.547


Haskell, 178 bytes

main=putStr(unlines[s:' ':show t|(s,t)<-zip" \"'(),-0123456789:<=S[\\]aehilmnoprstuwz|"[3,3,3,3,3,41,4,1,6,19,12,5,5,2,2,2,2,3,2,2,2,3,3,3,2,2,2,4,2,2,4,2,3,2,5,5,3,2,2,2]])--178

Nothing fancy. All characters of the program are in a literal list (String). So are the frequencies. Zip both lists and print. Output:
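Hard-coded tables like this are easy to get wrong, because changing a count also changes the digits in the source. A small checker helps. This Python sketch (the name `table_mismatches` is illustrative) compares a claimed table against the actual character counts of a source string.

```python
from collections import Counter


def table_mismatches(source: str, chars: str, counts: list[int]) -> dict[str, tuple[int, int]]:
    """Map each character whose claimed count is wrong to (claimed, actual)."""
    actual = Counter(source)
    claimed = dict(zip(chars, counts))  # pair characters with counts, as zip does above
    return {
        ch: (claimed.get(ch, 0), actual.get(ch, 0))
        for ch in sorted(set(claimed) | set(actual))
        if claimed.get(ch, 0) != actual.get(ch, 0)
    }


if __name__ == "__main__":
    print(table_mismatches("aab", "ab", [2, 1]))  # consistent table
    print(table_mismatches("aab", "ab", [1, 1]))  # 'a' is miscounted
```

An empty result means the table agrees with the source it was checked against.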

  3
" 3
' 3
( 3
) 3
, 41
- 4
0 1
1 6
2 19
3 12
4 5
5 5
6 2
7 2
8 2
9 2
: 3
< 2
= 2
S 2
[ 3
\ 3
] 3
a 2
e 2
h 2
i 4
l 2
m 2
n 4
o 2
p 3
r 2
s 5
t 5
u 3
w 2
z 2
| 2




Dart, 127 bytes (was 214)

A direct version:

main(){print("  22\n\" 3\n( 3\n) 3\n1 3\n2 15\n3 8\n4 1\n5 2\n8 2\n; 2\n\\ 23\na 2\ni 3\nm 2\nn 23\np 2\nr 2\nt 2\n{ 2\n} 2");}

The "4" is just a fiddling factor to make the numbers add up. See/run on DartPad.

Original: Standard quine tactic, and Dart's function names are a little too long for good golfing.

main({m,v,q:r'''main({m,v,q:r''}'')''{m={};for(v in q.split(''))m[v]=m[v]==null?2:m[v]+2;m.forEach((k,v)=>print("$k $v"));}'''}){m={};for(v in q.split(''))m[v]=m[v]==null?2:m[v]+2;m.forEach((k,v)=>print("$k $v"));}

See/run it on DartPad.




Haskell, 146 bytes

main=mapM putStrLn[a:" "++show s|a<-[' '..],s<-[sum[2|b<-show"main=mapM putStrLn[a: ++show s|a<-[' '..],s<-[sum[2|b<-show,a==b]],s>0]",a==b]],s>0]

Try it online!


  8
" 4
' 4
+ 4
, 6
- 6
. 4
0 2
2 2
: 2
< 6
= 6
> 2
L 2
M 2
S 2
[ 8
] 8
a 10
b 4
h 4
i 2
m 6
n 4
o 4
p 4
r 2
s 12
t 4
u 4
w 4
| 4

(Plus an additional newline)


The code is

main=mapM putStrLn[a:" "++show s|a<-[' '..],s<-[sum[2|b<-show"<code>",a==b]],s>0]

where "<code>" is the program's own code as a string literal, with the " characters removed.

a goes through the ASCII characters starting with a space. sum[2|b<-show"<code>",a==b] counts how often the character appears in the string, with each occurrence counted twice. a:" "++show s builds a string of the current character, a space and the character count. Finally mapM putStrLn prints each string in the list with a trailing newline.

The hardest part was getting the count of " right. Using just b<-"<code>" would count zero quotation marks because there are none in the string. Using show"<code>" adds a " to the front and end to the string, resulting in a count of four. So I had to put two additional quotation marks in the code, so instead of the (shorter) a:' ':show s I used a:" "++show s.
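The double-counting argument can be checked mechanically. The Python sketch below rebuilds the program from the part before the string literal (`head`) and the part after it (`tail`), both illustrative names, and models Haskell's `show` as simply wrapping the string in double quotes, which is valid here because the string contains nothing that needs escaping.

```python
from collections import Counter

# Program text before and after the embedded string literal.
head = 'main=mapM putStrLn[a:" "++show s|a<-[\' \'..],s<-[sum[2|b<-show'
tail = ",a==b]],s>0]"

# "<code>": the program's own code with the double quotes removed.
code = (head + tail).replace('"', "")
program = head + '"' + code + '"' + tail

# Model of show: wrap the string in double quotes (nothing to escape here).
shown = '"' + code + '"'

# Each character of the shown string is counted twice, as in sum[2|...].
doubled = {ch: 2 * n for ch, n in Counter(shown).items()}

print(doubled == dict(Counter(program)))   # True
print(doubled['"'], len(program))          # 4 146
```

The two quotes added by `show` are what make the count of " come out as four rather than zero.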

