Expand elastic tabs

Background

The tabs-versus-spaces war in programming has been going on for a long time. Spaces are too low-level to have all the properties you'd want for alignment and indentation, but tabs can't be relied upon to work in all contexts: some programs optimize tabs for indentation whilst making them unusable for tabulation, some optimize them for tabulation whilst making them mostly unusable for indentation, and pretty much no program can reasonably use tabs for alignment.

One proposed solution is elastic tabstops: a method of dynamically adapting the meaning of a tab character so that it is usable for indentation, tabulation, and alignment. The idea is that a tab in one line tabs to the same place as the corresponding tabs in neighbouring lines (if any exist), stretching if necessary to make the columns line up. Unfortunately, very few programs support them by default, so they aren't widely used. (If only the "elastic tab" character were in Unicode!) In this task, we're bringing elastic tabstops to the world of code golf.

The task

Brief description

Replace tabs with a minimal number of spaces, in such a way that for each n, the rightmost end of the nth tab on any given line is the same as the rightmost end of the nth tab on the neighbouring lines (assuming those tabs exist), and such that tabs tab to positions at least two spaces apart.

Precise description

Write a program or function whose input and output are multiline strings (you may take these as lists of single-line strings if you prefer). The input and output should be identical, except that each tab character (ASCII/Unicode 9) must be replaced with one or more spaces, subject to the following conditions:

  • Create a list of numbers corresponding to each line of output (its tabstops); specifically, for each tab character that was expanded on that line, take the column number of the last space that was expanded from that tab (here, "column number" = the number of characters on that line up to and including that character). So for example, if you expanded a␉b␉c to "a   b  c" (three spaces, then two), the list would be [4,7]. The lists must have the following properties:
    • For each pair of consecutive lines, one of those lines' tabstops list must be a prefix of the other's (a list is a prefix of itself, e.g. [4,7] and [4,7] is OK, as are [4,7,9] and [4,7], as are [4] and [4,7], but [4,5] and [4,7] would not be allowed).
    • For each number in each list, it must be greater by at least 2 than the number to its left (if it's the first element, treat the hypothetical "zeroth element" to its left as having a value of 0). (We're using a value of 2 for the purposes of this challenge because it gives good results for tabulation and alignment and decent results for indentation. Sorry, 4-space or 8-space indentation fans.)
  • The answer produced must be as short as possible while complying with the above restrictions.
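The two list rules above can be checked mechanically. The following is only a sketch, not part of the challenge: the names `tabstops` and `is_valid` are made up for illustration, and the check covers the prefix rule and the minimum-gap rule only (it does not test that the output is as short as possible).

```python
def tabstops(cells, spaces):
    """Return the tabstop list for one output line.

    cells  -- the text pieces surrounding the tabs (one more than len(spaces))
    spaces -- how many spaces each tab was expanded to
    """
    stops, column = [], 0
    for text, width in zip(cells, spaces):
        column += len(text) + width  # column of the last space from this tab
        stops.append(column)
    return stops


def is_valid(lines_stops, min_gap=2):
    """Check the minimum-gap rule and the prefix rule for all lines."""
    for stops in lines_stops:
        previous = 0  # the hypothetical "zeroth element"
        for stop in stops:
            if stop - previous < min_gap:
                return False
            previous = stop
    for a, b in zip(lines_stops, lines_stops[1:]):
        shared = min(len(a), len(b))
        if a[:shared] != b[:shared]:  # neither list is a prefix of the other
            return False
    return True
```

For instance, `tabstops(["a", "b", "c"], [3, 2])` reproduces the `[4,7]` list from the example in the first bullet.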

Example

Input

␉ represents a literal tab character, because literal tabs don't show up well on Stack Exchange.

// Elastic tabs handle indentation...
{
␉foo;
␉{
␉␉bar;
␉}
}
// alignment...
int␉foo␉(int),
␉bar␉(void),
␉baz;
float␉quux␉(float),
␉garply;
// tabulation...
␉1␉2␉3␉4
1␉1␉2␉3␉4
2␉2␉4␉6␉8
3␉3␉6␉9␉12
4␉4␉8␉12␉16
// and all three at once.
while True:
␉numbers␉=␉one␉and␉two
␉␉and␉three␉or␉four

Output

I've added extra information to the right of this output to show the tabstops on each line. Those aren't part of the expected output, they're just there to help explain what's going on.

// Elastic tabs handle indentation...       []
{                                           []
  foo;                                      [2]
  {                                         [2]
    bar;                                    [2,4]
  }                                         [2]
}                                           []
// alignment...                             []
int   foo (int),                            [6,10]
      bar (void),                           [6,10]
      baz;                                  [6]
float quux (float),                         [6,11]
      garply;                               [6]
// tabulation...                            []
  1 2 3  4                                  [2,4,6,9]
1 1 2 3  4                                  [2,4,6,9]
2 2 4 6  8                                  [2,4,6,9]
3 3 6 9  12                                 [2,4,6,9]
4 4 8 12 16                                 [2,4,6,9]
// and all three at once.                   []
while True:                                 []
  numbers =   one   and two                 [2,10,14,20,24]
          and three or  four                [2,10,14,20,24]
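For comparison with the expected output, here is an ungolfed reference sketch in Python. It is one possible reading of the specification, not a submitted answer: the function name `expand_elastic_tabs` and the `min_gap` parameter are assumptions made for this illustration. It works column by column: for the k-th tab, every maximal run of consecutive lines that all contain a k-th tab shares one tabstop, placed just far enough right to fit the widest cell in that run.

```python
def expand_elastic_tabs(text, min_gap=2):
    """Replace tabs with spaces following the elastic-tabstop rules."""
    lines = text.split("\n")
    cells = [line.split("\t") for line in lines]
    # stops[i][k] = column of the last space produced by tab k on line i
    stops = [[0] * (len(row) - 1) for row in cells]
    depth = max(len(s) for s in stops)
    for k in range(depth):
        i = 0
        while i < len(lines):
            if len(stops[i]) <= k:
                i += 1
                continue
            # Maximal run of consecutive lines that all have a tab number k.
            j = i
            while j < len(lines) and len(stops[j]) > k:
                j += 1
            # The previous tabstop is shared by the whole run, because the
            # run sits inside a run for tab k - 1.
            left = stops[i][k - 1] if k else 0
            widest = max(len(cells[r][k]) for r in range(i, j))
            width = max(min_gap, widest + 1)  # at least one space per tab
            for r in range(i, j):
                stops[r][k] = left + width
            i = j
    out = []
    for row, row_stops in zip(cells, stops):
        line = ""
        for cell, stop in zip(row, row_stops):
            line = (line + cell).ljust(stop)  # pad with spaces up to the tabstop
        out.append(line + row[-1])
    return "\n".join(out)
```

Each tabstop is chosen as small as the rules allow given the tabstops to its left, which is why the result should be minimal; treat that as an argument sketch rather than a proof.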

Clarifications

  • The input won't contain a tab character at the end of a line (basically because this is an uninteresting case which wouldn't make a visual difference to the output).
  • The input will only contain printable ASCII (including space), plus newline and tab. As such, it's up to you whether you treat the input as bytes or characters; the two will be equivalent.
  • This is, in its spirit, a challenge about formatting output for display on the screen. As such, this challenge uses the rules for ascii-art challenges (e.g. you're allowed to output a string out of order via the use of terminal cursor motion commands if you wish), even though it technically isn't about ASCII art.
  • Despite the previous point, outputting the answer via displaying it on the screen isn't necessary (although you can certainly output like that if you want to); you can use any method of outputting a string that PPCG allows by default (e.g. returning it from a function).
  • The input will not contain trailing whitespace (except for, if you wish, a single trailing newline). Trailing whitespace on the output will be irrelevant from the point of view of correctness (i.e. you can add trailing whitespace or leave it off, it doesn't matter). Again, this is because it wouldn't show up on a screen.

Victory condition

This is code-golf, so shorter is better.

user62131

Posted 2017-05-21T02:30:03.520

For people who can see deleted posts: the Sandbox post was here.

– None – 2017-05-21T02:30:46.930

It seems to me that the example has a space too much before each ( in the alignment section. And the output for public:␉char␉c; is missing. – Ørjan Johansen – 2017-05-21T03:40:42.627

@ØrjanJohansen: thanks, looks like some typos made as I was fixing the test cases before, fixed now. – None – 2017-05-21T17:08:52.210

Answers

Retina, 130 bytes (previously 132)

+m`(^|	)	
$1 	
{+sm`^(([^	¶])*)(	.*^(?<-2>[^	¶])*(?(2)(?!))[^	¶]+	)
$1 $3
+sm`^(([^	¶])*	.*^(?<-2>[^	¶])*(?(2)|(?!)))	
$1 	
%1`	

Try it online! Link includes test suite. Explanation:

+m`(^|␉)␉
$1␠␉

Insert spaces before leading tabs and between consecutive tabs. This ensures tabs are at least two spaces apart. Then while tabs still exist:

{   Repeat these stages until all tabs have been processed
 +  Repeat this stage until the first line has the longest indentation
  sm`^(([^␉¶])*)(␉    Look for a line with a tab
                   .*^     Look for another line
                      (?<-2>[^␉¶])*(?(2)(?!))[^␉¶]+␉)   With more indentation
$1␠$3   Add indentation to the first line

Find the line with the most text before a tab and add spaces to the first line until it aligns. (If the first line already has the longest indentation, this might add spaces to another line, but the next stage would have added them anyway.)

+   Repeat this stage until all lines have the same indentation
 sm`^(([^␉¶])*␉ Look for a line with a tab
                .*^  Look for another line
                   (?<-2>[^␉¶])*(?(2)|(?!)))␉    With less indentation
$1␠␉    Add indentation to that line

Add spaces to the remaining lines until they all align.

%1`␉
␠

Replace the first tab on each line with a space.
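As an illustration of the first stage only (inserting a space before leading tabs and between consecutive tabs), here is a rough Python equivalent. This is not part of the Retina answer: the function name `pad_tabs` is made up, and the explicit loop stands in for Retina's `+` (repeat until the string stops changing).

```python
import re


def pad_tabs(text):
    """Insert a space before every leading tab and between consecutive tabs."""
    while True:
        # (^|\t)\t matches a tab at the start of a line or right after a tab.
        padded = re.sub(r"(^|\t)\t", r"\1 \t", text, flags=re.M)
        if padded == text:  # fixed point reached, as with Retina's + loop
            return text
        text = padded
```

After this step every tab is preceded by at least one non-tab character on its line, so replacing each tab with at least one space keeps tabstops at least two columns apart.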

Neil

Posted 2017-05-21T02:30:03.520

Reputation: 95 035

This gives the wrong output for a couple of lines (most notably the bar; line). I'm not sure how easily fixable that is. – None – 2017-05-21T17:10:11.233

@ais523 Conveniently the fix saved me a couple of bytes! – Neil – 2017-05-21T18:43:41.053