Some people insist on using spaces for tabulation and indentation.
For tabulation, that's indisputably wrong. By definition, tabulators must be used for tabulation.
Even for indentation, tabulators are objectively superior:
There's clear consensus in the Stack Exchange community.
Using a single space for indentation is visually unpleasant; using more than one is wasteful.
As all code golfers know, programs should be as short as possible. Not only does this save hard disk space, compilation times are also reduced if fewer bytes have to be processed.
By adjusting the tab width¹, the same file looks different on each computer, so everybody can use his favorite indent width without modifying the actual file.
All good text editors use tabulators by default (and definition).
I say so and I'm always right!
Sadly, not everybody listens to reason. Somebody has sent you a file that is doing it wrong™ and you have to fix it. You could just do it manually, but there will be others.
It's bad enough that spacers are wasting your precious time, so you decide to write the shortest possible program to take care of the problem.
Task
Write a program or a function that does the following:
1. Read a single string either from STDIN or as a command-line or function argument.
2. Identify all locations where spaces have been used for tabulation or indentation.
   - A run of spaces is indentation if it occurs at the beginning of a line.
   - A run of two or more spaces is tabulation if it isn't indentation.
   - A single space that is not indentation may or may not have been used for tabulation. As expected when you use the same character for different purposes, there's no easy way to tell. Therefore, we'll say that the space has been used for confusion.
3. Determine the longest possible tab width¹ for which all spaces used for tabulation or indentation can be replaced with tabulators, without altering the appearance of the file.
   If the input contains neither tabulation nor indentation, it is impossible to determine the tab width. In this case, skip the next step.
4. Using the previously determined tab width, replace all spaces used for tabulation or indentation with tabulators.
   Also, whenever possible without altering the appearance of the file, replace all spaces used for confusion with tabulators. (If in doubt, get rid of spaces.)
5. Return the modified string from your function or print it to STDOUT.
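The steps above can be sketched as an ungolfed Python function. This is only one possible reading of the task, not a reference solution: the name `fix_whitespace` is hypothetical, and the sketch assumes that a run of spaces can be replaced without changing the appearance exactly when it ends on a tab stop, so the longest usable tab width is the greatest common divisor of the end columns of all indentation and tabulation runs.

```python
import re
from functools import reduce
from math import gcd


def fix_whitespace(text: str) -> str:
    """Replace spaces with tabs wherever the rendering stays the same."""
    lines = text.split("\n")

    # End columns of every run that MUST become tabs:
    # indentation (starts at column 0) or tabulation (two or more spaces).
    ends = [
        m.end()
        for line in lines
        for m in re.finditer(r" +", line)
        if m.start() == 0 or m.end() - m.start() >= 2
    ]

    # Assumption: the longest valid tab width is the gcd of those end columns.
    width = reduce(gcd, ends, 0)
    if width == 0:
        # Neither tabulation nor indentation: leave the input untouched.
        return text

    def to_tabs(m: re.Match) -> str:
        start, end = m.start(), m.end()
        if end % width:
            # Only a single "confusion" space can end off a tab stop; keep it.
            return m.group()
        # One tab per tab stop crossed between start and end.
        return "\t" * (end // width - start // width)

    return "\n".join(re.sub(r" +", to_tabs, line) for line in lines)
```

Called on the string `a`, four spaces, `bc`, three spaces, `def`, two spaces, `ghij`, the sketch picks a tab width of 5 and returns the four words separated by single tabs. Input such as `a b c d`, which contains only confusion spaces, is returned unchanged.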
Examples
All spaces of
a    bc   def  ghij
are tabulation.
Each run of spaces pads the preceding string of non-space characters to a width of 5, so the correct tab width is 5 and the correct output² is
a--->bc-->def->ghij
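One way to check that a conversion leaves the appearance unchanged is to expand the tabs again and compare the result with the original. The sketch below uses Python's built-in `str.expandtabs`; the helper name `renders_identically` is hypothetical, and the input string is the first example as reconstructed from its output (`a`, four spaces, `bc`, three spaces, `def`, two spaces, `ghij`).

```python
def renders_identically(original: str, converted: str, tab_width: int) -> bool:
    """True if `converted` looks like `original` when tabs are `tab_width` wide."""
    # str.expandtabs replaces each tab with spaces up to the next tab stop
    # and restarts the column count after every newline.
    return converted.expandtabs(tab_width) == original


original = "a    bc   def  ghij"
converted = "a\tbc\tdef\tghij"

# With a tab width of 5 the two strings render the same ...
assert renders_identically(original, converted, 5)
# ... but with a tab width of 4 they do not.
assert not renders_identically(original, converted, 4)
```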
The first two spaces of
ab  cde f
ghi jk lm
are tabulation, the others confusion.
The correct tab width is 4, so the correct output² is
ab->cde>f
ghi>jk lm
The last space remains untouched, since it would be rendered as two spaces if replaced by a tabulator:
ab->cde>f
ghi>jk->lm
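The reason the last space has to stay can be made explicit with a little column arithmetic. The following is a minimal sketch, assuming 0-based columns and the hypothetical helper names `tab_span` and `single_space_is_replaceable`: a tabulator always advances to the next tab stop, so the number of columns it covers depends on where it starts.

```python
def tab_span(column: int, tab_width: int) -> int:
    """Columns covered by a tab that starts at 0-based `column`."""
    return tab_width - column % tab_width


def single_space_is_replaceable(column: int, tab_width: int) -> bool:
    """A lone space may become a tab only if that tab is one column wide."""
    return tab_span(column, tab_width) == 1


# Second line of the example, "ghi jk lm", with a tab width of 4:
assert single_space_is_replaceable(3, 4)      # space after "ghi" ends at column 4
assert not single_space_is_replaceable(6, 4)  # space after "jk" would span 2 columns
```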
All but one of the spaces of
int
    main( )
    {
        puts("TABS!");
    }
are indentation, the other is confusion.
The indentation levels are 0, 4 and 8 spaces, so the correct tab width is 4 and the correct output² is
int
--->main( )
--->{
--->--->puts("TABS!");
--->}
The space in
( )
would be rendered as three spaces if replaced by a tabulator, so it remains untouched.
The first two spaces of
  x yz w
are indentation, the others confusion.
The proper tab width is 2 and the correct output² is
->x>yz w
The last space would be rendered as two spaces if replaced by a tabulator, so it remains untouched.
The first two spaces of
  xy   zw
are indentation, the other three are tabulation.
Only a tab width of 1 allows all spaces to be eliminated, so the correct output² is
>>xy>>>zw
All spaces of
a b c d
are confusion.
There is no longest possible tab width, so the correct output² is
a b c d
Additional rules
The input will consist entirely of printable ASCII characters and linefeeds.
You may assume that there are at most 100 lines of text and at most 100 characters per line.
If you choose STDOUT for output, you may print a single trailing linefeed.
Standard code-golf rules apply.
¹ The tab width is defined as the distance in characters between two consecutive tab stops, using a monospaced font.
² The ASCII art arrows represent the tabulators Stack Exchange refuses to render properly, for which I have submitted a bug report. The actual output has to contain actual tabulators.
+1 for finally putting this nonsensical space/tab issue to rest :D – Geobits – 2015-09-03T17:40:36.440
"programs should be as short as possible" I believe I have found Arthur Whitney's long-lost brother!! – kirbyfan64sos – 2015-09-03T17:45:48.180
@Dennis "That said, only a moron would use tabs to format their code." Clear consensus, eh? – primo – 2015-09-03T19:26:25.783
Tabs are unholy demonspawn that deserve to have their bits ripped apart and their ASCII code disgraced until their incompetent lack-of-a-soul has been thoroughly ground into a pulp. Errr, I mean, +1, nice challenge, even though it reeks of blasphemy. ;) – Doorknob – 2015-09-04T02:42:40.527
I was crying each time a colleague add a tab in my beautiful space indented code. Then I discovered CTRL+K+F in Visual Studio. I do it each time I open a modified file. My life is better now. – Michael M. – 2015-09-04T09:03:02.777
Let's use 4 spaces for the 1st level and a tab for the 2nd, or 1 space for the 1st level for code golfing. – jimmy23013 – 2015-09-04T09:11:30.953
I don't understand "The last space remains untouched, since it would be rendered as two spaces if replaced by a tabulator:" in your second example. Why is it ghi>jk lm and not ghi>jk>lm or ghi jk lm, when both are confusion spaces? – Fatalize – 2015-09-04T09:15:21.027
@Fatalize Because a tabulator advances always to the next tab stop. With a tab width of 4, a single tabulator can replace 1, 2, 3 or even 4 spaces, depending on where it occurs. – Dennis – 2015-09-04T13:33:23.947
Emacs-lisp, 8 bytes: (tabify) (not really, of course, but this is close). – coredump – 2015-09-04T19:30:52.107