Because we are talking about indentation and whitespaces we just have to write the code in a programming a language that is actually designed around whitespace, as that has to be easiest, right?
So the solution is:
Here it is in base64:
ICAgCQogICAgCgkJICAgIAkgCiAgICAKCQkgICAgCQkKICAgIAoJCSAgICAJICAKICAgIAoJCSAgICAJIAkKICAgIAoJCSAKIAogCQoKICAgCSAKICAgCQoJCQkgICAJCQkgCQkKCSAgCQoJICAJICAKICAgCQoJCQkgICAJIAkgCgkgIAkKCSAgCSAJCiAgIAkKCQkJICAgCSAgCQoJICAJCgkgIAkgCQogICAJCgkJCSAgIAkgICAgIAoJICAJCgkgIAkgCQogICAJCgkJCSAgIAkJCQkJIAkKCSAgCQoJICAJCSAKICAgCQoJCQkgICAJICAgCSAKCSAgCQoJICAJICAgIAogICAJCgkJCSAgIAkgCSAgIAoJICAJCgkgIAkgICAJCQogICAJCgkJCSAgIAkgCSAgCQoJICAJCgkgIAkgIAkgIAogICAJIAogICAJCgkJIAogCSAJCQkKICAgCQoJCQkJCiAgCgkKCiAgIAkgIAogICAJCgkJCQkKICAKCQoKICAgCSAJCgoJCgogICAJCSAKICAgCQkKICAgCQkKCQkJICAgCQoJICAJCQkgICAgCSAKICAgCQoJCSAKIAkgCQkJCiAgIAkKCQkJCQogIAoJCgogICAJICAgIAogICAJIAogICAJIAoJCSAKIAkgCQkJCiAgIAkKCQkJCQogIAoJCgogICAJICAgCQkKICAgCSAJCiAgIAkgCQoJCQkgICAJCgkgICAJCSAgICAJIAogICAJCgkJIAogCSAJCQkKICAgCQoJCQkJCiAgCgkKCiAgIAkgIAkgIAogICAJIAkKICAgCSAJCgkJCSAgIAkKCSAgCQkJICAgIAkgCiAgIAkKCQkgCiAJIAkJCQogICAJCgkJCQkKICAKCQoKICAgCQkKICAgCQoJCQkgICAJIAkgCgkgIAkKCSAgCQkgIAogICAJCgkJCSAgIAkJCQkgCQkKCSAgCQoJICAJCSAJCiAgIAkKCQkJICAgCQkJCQkgCQoJICAJCgkgIAkJCSAKICAgCQoJCQkgICAJCQkgCQkKCSAgCQoJICAJCQkJCiAgIAkKCQkJICAgCSAgIAkgCgkgIAkKCSAgCQkgCQkKICAgCQoJCQkgICAJIAkgCSAKCSAgCQoJICAJCQkgIAogICAJCgkJCSAgIAkgCQkJCQoJICAJCgkgIAkJCSAJCiAgIAkKCQkJICAgCSAJICAgCgkgIAkKCSAgCSAgCSAJCiAgIAkKCQkJICAgCSAJICAJCgkgIAkKCSAgCSAgCQkgCiAgIAkKCQkJCQogIAoJCgogICAJCSAgCiAgIAkgCiAgICAKCQkgCgkKCiAgIAkJIAkKICAgCQoJCQkJCiAgICAgCSAKICAgIAoJCSAgICAJCQogICAJCQoJCQkgICAJCgkgICAJCSAKCQoKICAgCQkJIAogICAJCQogICAJCQoJCQkgICAJCgkgIAkJCSAKIAkgCQkJCiAgIAkKCQkJCQogICAgIAkgCiAgICAKCQkgCgkKCiAgIAkJCQkKICAgCQoJCQkJCiAgICAgCSAJCgkJCQoJICAJIAkgCgoJCgogICAJICAJCQkKICAgCSAKICAgIAoJCSAKCQoKICAgCQkgCQkKICAgCQoJCQkJCiAgICAgCSAKICAgCSAKCQkgCgkKCiAgIAkJCSAgCiAgIAkgIAoJCQkgICAJIAkJCQkKCSAgCQoJICAJCQkJIAogICAJCgkJCQkKICAKCQoKICAgCQkJIAkKICAgCSAgCgkJCSAgIAkgCQkJCQoJICAJCgkgIAkJCQkJCiAgIAkKCQkJCQogIAoJCgogICAJCQkJIAogICAJCgkJCQkKICAgICAJIAogICAJCQoJCSAKCQoKICAgCQkJCQkKICAgCQoJCQkJCiAgICAgCSAKICAgCSAgCgkJIAoJCgogICAJICAJIAkKICAgCSAJCiAgIAkgCQoJCQkgICAJCgkgICAJCSAgICAJCgkJCQkKICAKCQoKICAgCSAgCQkgCiAgIAkgCQogICAJIAkKCQkJICAgCQoJICAJCQkgICAgCQoJCQkJCiAgCgkKCiAgIAkJCQogICAJIAkgCgkKICAgICAJCQoJCQkKICAgCSAgCQogCiAKCSAgCSAgIAogCiAKCQkgCSAgIAogICAJICAgICAKCQogICAgIAkgICAgIAoJCiAgICAgCQoJICAJCiAKIAkgIAkKCiAgIAkgICAKCgkKCiAgIAkgCSAJCiAgIAkKCQkJICAgCSAJIAoJICAJCgkgIAkJICAgCiAgIAkKCQkJICAgCSAgIAkgCgkgIAkKCSAgCQkgIAkKICAgCQoJCQkJCiAgCgkKCiAgIAkJICAgCiAgIAkgCiAgICAKCQkgCgkKCiAgIAkJICAJCiAgIAkgIAoJCQkgICAJIAkJCSAgCgkgIAkKCSAgCQkgCSAKICAgCSAKICAgCQoJCSAgICAJCgkJCQkKICAKCQoKICAgCQkgCSAKICAgCQoJCQkJCiAgCgkKCiAgIAkgCQkgCiAgIAkKCQkJICAgCSAJCQkJCgkgIAkKCSAgCSAgICAgCiAgIAkKCQkJCQogIAoJCgogICAJICAgICAKICAgCSAgCgkJCSAgIAkgCSAJIAoJICAJCgkgIAkgICAgCQogICAJCgkJCQkKICAKCQoKICAgCSAgICAJCiAgIAkKCQkJCQogICAgIAkgCiAgIAkKCQkgCgkKCiAgIAkgCQkJCiAgIAkKCQkJICAgCSAJIAoJICAJCgkgIAkgICAJIAogICAJCgkJCQkKICAKCQoKICAgCSAgIAkgCiAgIAkgCiAgICAKCQkgCgkKCiAgIAkKICAgCSAgCiAgIAkKCQkJCQkgICAgCQoJCgkgICAgCSAKCQkJCgkgIAkgCSAKICAgCSAKCQkJICAgCQoJICAJCgkgIAkgICAJCiAgIAkgCgkJCSAgIAkgCgkgIAkKCSAgCSAgCSAKICAgCSAKCQkJICAgCQkKCSAgCQoJICAJICAJCQogICAJIAoJCQkgICAJICAKCSAgCQoJICAJIAkgIAoKICAgCSAgIAkKCiAJIAkJCgogCiAJIAkJCgogICAJIAkgCgogCSAJIAoKIAogCSAJCQoKICAgCSAgCSAKCiAJIAkgCSAJCgogCiAJIAkJCgogICAJICAJCQoKIAkgCSAJCSAKCiAKIAkgCQkKCiAgIAkgCSAgCgogCSAJIAkJCQoKICAgCSAJCQoKIAogCQoKCgo=
For those who have issues printing out the code on a paper here is the annotated version (you can find a compiler for this at the end of the answer):
# heap structure:
# 1: read buffer
# 2: parser state
# 0: before indentation
# 1: after indentation
# 2: inside a string literal
# 3: inside a multiline comment
# 4: inside a single line comment
# 3: identation
# 4: old read buffer
# 5: parenthesis nesting amount
# -------------------
# initialize heap
# -------------------
SS 1 | SS 0 | TTS # [1] := 0
SS 2 | SS 0 | TTS # [2] := 0
SS 3 | SS 0 | TTS # [3] := 0
SS 4 | SS 0 | TTS # [4] := 0
SS 5 | SS 0 | TTS # [5] := 0
LSL 1 # goto L1
# -------------------
# sub: determine what to do in state 0
# -------------------
LSS 2 # LABEL L2
SS 1 | TTT | SS 59 | TSST | LTS 4 # if [1] == ; GOTO L4
SS 1 | TTT | SS 10 | TSST | LTS 5 # if [1] == \n GOTO L5
SS 1 | TTT | SS 9 | TSST | LTS 5 # if [1] == \t GOTO L5
SS 1 | TTT | SS 32 | TSST | LTS 5 # if [1] == ' ' GOTO L5
SS 1 | TTT | SS 125 | TSST | LTS 6 # if [1] == } GOTO L6
SS 1 | TTT | SS 34 | TSST | LTS 16 # if [1] == " GOTO L16
SS 1 | TTT | SS 40 | TSST | LTS 35 # if [1] == ( GOTO L35
SS 1 | TTT | SS 41 | TSST | LTS 36 # if [1] == ) GOTO L36
SS 2 | SS 1 | TTS # [2] := 1
LST 7 # call L7
SS 1 | TTT | TLSS # print [1]
LTL # return
LSS 4 # label L4 - ; handler
SS 1 | TTT | TLSS # print [1]
LTL # return
LSS 5 # label L5 - WS handler
LTL # return
LSS 6 # label L6 - } handler
# decrease identation by one
SS 3 | SS 3 | TTT | SS 1 | TSST | TTS # [3] := [3] - 1
SS 2 | SS 1 | TTS # [2] := 1
LST 7 # call L7
SS 1 | TTT | TLSS # print [1]
LTL # return
LSS 16 # label L16 - " handler
SS2 | SS 2 | TTS # [2] := 2
LST 7 # call L7
SS1 | TTT | TLSS # print [1]
LTL
LSS 35
SS 5 | SS 5 | TTT | SS 1 | TSSS | TTS # [5] := [5] + 1
SS 2 | SS 1 | TTS # [2] := 1
LST 7 # call L7
SS1 | TTT | TLSS # print [1]
LTL
LSS 36
SS 5 | SS 5 | TTT | SS 1 | TSST | TTS # [5] := [5] - 1
SS 2 | SS 1 | TTS # [2] := 1
LST 7 # call L7
SS1 | TTT | TLSS # print [1]
LTL
# -------------------
# sub: determine what to do in state 1
# -------------------
LSS 3 # LABEL L3
SS 1 | TTT | SS 10 | TSST | LTS 12 # if [1] == \n GOTO L12
SS 1 | TTT | SS 123 | TSST | LTS 13 # if [1] == { GOTO L13
SS 1 | TTT | SS 125 | TSST | LTS 14 # if [1] == } GOTO L14
SS 1 | TTT | SS 59 | TSST | LTS 15 # if [1] == ; GOTO L15
SS 1 | TTT | SS 34 | TSST | LTS 27 # if [1] == " GOTO L27
SS 1 | TTT | SS 42 | TSST | LTS 28 # if [1] == * GOTO L28
SS 1 | TTT | SS 47 | TSST | LTS 29 # if [1] == / GOTO L29
SS 1 | TTT | SS 40 | TSST | LTS 37 # if [1] == ( GOTO L37
SS 1 | TTT | SS 41 | TSST | LTS 38 # if [1] == ) GOTO L38
SS 1 | TTT | TLSS # print [1]
LTL # return
LSS 12 # LABEL L12 - \n handler
SS 2 | SS 0 | TTS # [2] := 0
LTL # return
LSS 13 # LABEL L13 - { handler
SS 1 | TTT | TLSS # print [1]
SS 2 | SS 0 | TTS # [2] := 0
SS 3 | SS 3 | TTT | SS 1 | TSSS | TTS # [3] := [3] + 1
LTL # return
LSS 14 # LABEL L14 - } handler
SS 3 | SS 3 | TTT | SS 1 | TSST | TTS # [3] := [3] - 1
LST 7 # call L7
SS 1 | TTT | TLSS # print [1]
SS 2 | SS 0 | TTS # [2] := 0
LTL # return
LSS 15 # LABEL L15 - ; handler
SS 1 | TTT | TLSS # print [1]
SS 5 | TTT | LTS 10 # if [5] == 0 GOTO L39
LTL
LSS 39
SS 2 | SS 0 | TTS # [2] := 0
LTL # return
LSS 27 # label L27 - " handler
SS1 | TTT | TLSS # print [1]
SS2 | SS 2 | TTS # [2] := 2
LTL
LSS 28 # label L28 - * handler - this might start a comment
SS 4 | TTT | SS 47 | TSST | LTS 30 # if [4] == / GOTO L30
SS1 | TTT | TLSS # print [1]
LTL
LSS 29 # label L29 - / handler - this might start a comment
SS 4 | TTT | SS 47 | TSST | LTS 31 # if [4] == / GOTO L31
SS1 | TTT | TLSS # print [1]
LTL
LSS 30 # label L30 - /* handler
SS1 | TTT | TLSS # print [1]
SS2 | SS 3 | TTS # [2] := 3
LTL
LSS 31 # label L31 - // handler
SS1 | TTT | TLSS # print [1]
SS2 | SS 4 | TTS # [2] := 4
LTL
LSS 37
SS 5 | SS 5 | TTT | SS 1 | TSSS | TTS # [5] := [5] + 1
SS1 | TTT | TLSS # print [1]
LTL
LSS 38
SS 5 | SS 5 | TTT | SS 1 | TSST | TTS # [5] := [5] - 1
SS1 | TTT | TLSS # print [1]
LTL
# -------------------
# sub: print identation
# -------------------
LSS 7 # label L7 - print identation
SS 10 | TLSS # print \n
SS 3 | TTT # push [3]
LSS 9 # label L9 - start loop
SLS | LTS 8 # if [3] == 0 GOTO L8
SLS | LTT 8 # if [3] < 0 GOTO L8 - for safety
SS 32 | TLSS # print ' '
SS 32 | TLSS # print ' '
SS 1 | TSST # i := i - 1
LSL 9 # GOTO L9
LSS 8 # label L8 - end loop
LTL #
# -------------------
# sub: L21 - string literal handler
# -------------------
LSS 21
SS 1 | TTT | SS 10 | TSST | LTS 24 # if [1] == \n GOTO L24
SS 1 | TTT | SS 34 | TSST | LTS 25 # if [1] == " GOTO L25
SS 1 | TTT | TLSS # print [1]
LTL
LSS 24 # \n handler - this should never happen, but let's be prepared and reset the parser
SS 2 | SS 0 | TTS # [2] := 0
LTL # return
LSS 25 # " handler - this might be escaped, so be prepared
SS 4 | TTT | SS 92 | TSST | LTS 26 # if [4] == \ GOTO L26
SS 2 | SS 1 | TTS # [2] := 1
SS 1 | TTT | TLSS # print [1]
LTL
LSS 26 # \\" handler - escaped quotes don't finish the literal
SS 1 | TTT | TLSS # print [1]
LTL
# -------------------
# sub: L22 - multiline comment handler
# -------------------
LSS 22
SS 1 | TTT | SS 47 | TSST | LTS 32 # if [1] == / GOTO L32
SS 1 | TTT | TLSS # print [1]
LTL
LSS 32
SS 4 | TTT | SS 42 | TSST | LTS 33 # if [4] == * GOTO L33
SS 1 | TTT | TLSS # print [1]
LTL
LSS 33
SS 1 | TTT | TLSS # print [1]
SS 2 | SS 1 | TTS # [2] := 1
LTL
# -------------------
# sub: L23 - singleline comment handler
# -------------------
LSS 23
SS 1 | TTT | SS 10 | TSST | LTS 34 # if [1] == \n GOTO L34
SS 1 | TTT | TLSS # print [1]
LTL
LSS 34
SS 2 | SS 0 | TTS # [2] := 0
LTL
# -------------------
# main loop
# -------------------
LSS 1 # LABEL L1
SS 4 | SS 1 | TTT | TTS # [4] := [1]
SS 1 | TLTS # [1] := read
SS 2 | TTT | LTS 10 # if [2] == 0 GOTO L10
SS 2 | TTT | SS 1 | TSST | LTS 17 # if [2] == 1 GOTO L17
SS 2 | TTT | SS 2 | TSST | LTS 18 # if [2] == 2 GOTO L18
SS 2 | TTT | SS 3 | TSST | LTS 19 # if [2] == 3 GOTO L19
SS 2 | TTT | SS 4 | TSST | LTS 20 # if [2] == 4 GOTO L20
LSS 17
LST 3 # call L3
LSL 11 # GOTO L11
LSS 10 # label L10
LST 2 # call L2
LSL 11
LSS 18
LST 21
LSL 11
LSS 19
LST 22
LSL 11
LSS 20
LST 23
LSS 11 # label L11
LSL 1 # goto L1
LLL # END
This is still work in progress, although hopefully it should pass most of the criterias!
Currently supported features:
- fix identation based on the
{
and }
characters.
- add a newline after
;
- handle indentation characters inside string literals (including the fact that string literals are not closed when encountering a
\"
)
- handle indentation characters inside single and multiline comments
- doesn't add newline characters if inside parentheses (like a
for
block)
Example input (I added some edge cases based Quincunx's comment, so you can check that it behaves properly):
/* Hai Worldz. This code is to prevent formatting: if(this_code_is_touched){,then+your*program_"doesn't({work)}correctl"y.} */
#include<stdio.h>
#include<string.h>
int main() {
int i;
char s[99];
printf("----------------------\n;;What is your name?;;\n----------------------\n\""); //Semicolon added in the {;} string just to annoy you
/* Now we take the {;} input: */
scanf("%s",s);
for(i=0;i<strlen(s);i++){if(s[i]>='a'&&s[i]<='z'){
s[i]-=('a'-'A'); //this is same as s[i]=s[i]-'a'+'A'
}}printf("Your \"name\" in upper case is:\n%s\n",s);
return 0;}
Example output:
[~/projects/indent]$ cat example.c | ./wspace indent.ws 2>/dev/null
/* Hai Worldz. This code is to prevent formatting: if(this_code_is_touched){,then+your*program_"doesn't({work)}correctl"y.} */
#include<stdio.h>
#include<string.h>
int main() {
int i;;
char s[99];;
printf("----------------------\n;;What is your name?;;\n----------------------\n\"");; //Semicolon added in the {;} string just to annoy you
/* Now we take the {;} input: */
scanf("%s",s);;
for(i=0;i<strlen(s);i++){
if(s[i]>='a'&&s[i]<='z'){
s[i]-=('a'-'A');; //this is same as s[i]=s[i]-'a'+'A'
}
}
printf("Your \"name\" in upper case is:\n%s\n",s);;
return 0;;
}
Note, that because whitespace doesn't support EOF checking the intepreter throws an exception, which we need to suppress. As there is no way in whitespace to check for EOF (as far as I know as this is my first whitespace program) this is something unavoidable, I hope the solution still counts.
This is the script I used to compile the annotated version into proper whitespace:
#!/usr/bin/env ruby
ARGF.each_line do |line|
data = line.gsub(/'.'/) { |match| match[1].ord }
data = data.gsub(/[^-LST0-9#]/,'').split('#').first
if data
data.tr!('LST',"\n \t")
data.gsub!(/[-0-9]+/){|m| "#{m.to_i<0?"\t":" "}#{m.to_i.abs.to_s(2).tr('01'," \t")}\n" }
print data
end
end
To run:
./wscompiler.rb annotated.ws > indent.ws
Note that this, apart from converting the S
, L
and T
characters, also allow single line comments with #
, and can automatically convert numbers and simple character literals into their whitespace representation. Feel free to use it for other whitespace projects if you want
2Any reasons for the down votes? – user12205 – 2014-02-07T20:46:43.963
1I could remove all unnecessary whitespace (
s/\s+/ /
) and call it a day – ratchet freak – 2014-02-10T10:06:48.8731@ratchet freak you could, but I don't think this would be an answer with a lot of upvotes. – user12205 – 2014-02-10T16:52:58.580
I think you should provide an example of input. – None – 2014-02-13T08:49:44.967
@BenH You mean you haven't heard of C?! Well, here is a sample program: (I think the
#include<stdio.h>
needs to be on its own line for this to work)/* Hai Worldz. This code is to prevent formatting: if(this_code_is_touched){,then+your*program_"doesn't({work)}correctl"y.} */ #include<stdio.h> main(){printf("Hello World");}
– Justin – 2014-02-13T09:13:00.193No, I mean that if we have to write an indenter, at least provide in the main post a good example of C code containing every kind of cases we could see. Like a for loop, while, do while, if else, function, pointers maybe (but if students can't even indent, I guess they don't know what's a pointer), single and multi lines comments etc... I hope it's more clear. – None – 2014-02-13T09:33:53.193
Can you give some example C code? I do not program C but I would like to join the contest. – CousinCocaine – 2014-02-13T10:17:46.333
1@Quincunx thanks for setting up the bounty. I thought my question sucked. – user12205 – 2014-02-13T11:58:57.467
@BenH and CousinCocaine there you go. Also, just to clarify, your program should be able to implement the features in ALL valid C code, and you are recommended to implement as many features as you can. But keep in mind that this is a popularity contest, i.e. the answer with most votes wins, so essentially there is only one requirement: impress your fellow code-golfers. – user12205 – 2014-02-13T12:28:15.050
Seriously? I'd download gnu-indent if not on a unix box – Tom Tanner – 2014-02-13T13:23:08.513
@TomTanner, I posted your solution as an answer, but seriously I think this is not a 'problem' at all, tons of solutions out there that do not need a Bounty or a program to be written. We are reinventing the wheel... ;) – CousinCocaine – 2014-02-13T14:18:38.903
@ace No problem; I thought this question was great. I was sad to see it stay in the dark for so long (or were you hoping to get tumbleweed?). – Justin – 2014-02-13T18:30:56.037
1@CousinCocaine: this is a popularity context. They are not designed to solve the problem in an elegant, or standard way. They are designed to solve the problem in the coolest way you can possibly imagine. – SztupY – 2014-02-13T19:36:11.900
@SztupY, you are right and my sincere excuses for the demotivating words! – CousinCocaine – 2014-02-14T08:48:29.177