Expand tabs (implement expand(1))

10

2

Your task this time is to implement a variant of the POSIX expand(1) utility which expands tabs to spaces.

Your program is to take a tabstop specification, then read input on standard in and replace each tab character in the input with the number of spaces needed to reach the next tabstop. The result should be written to standard out.

Tabstop specification

A tabstop specification consists of either a single number, or a comma-separated list of tabstops. In the case of a single number, it is repeated as if multiples of it occurred in a comma-separated list (i.e. 4 acts as 4,8,12,16,20,...). Each entry in a comma-separated list is a positive integer optionally prefixed by a +. A + prefix indicates a relative difference to the previous value in the comma-separated list. The first value in the list must be absolute (i.e. unprefixed). The tabstops specify the column of the next non-space character (following the expanded tab), with the leftmost column taken as number 0. Tabs should always expand to at least one space.
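To make the rules above concrete, here is a minimal ungolfed sketch of a parser in Ruby. The name parse_tabstops and the width: keyword are my own inventions for illustration; the 80 comes from the column limit stated in the challenge.

```ruby
# Hypothetical helper (not part of the challenge): turn a tabstop
# specification string into a list of absolute tabstop columns.
def parse_tabstops(spec, width: 80)
  entries = spec.strip.split(',')

  if entries.size == 1
    step = entries.first.to_i
    # A single number N stands for N, 2N, 3N, ... up to the width limit.
    return (1..width / step).map { |k| k * step }
  end

  stops = []
  entries.each do |entry|
    if entry.start_with?('+')
      # '+' prefix: offset relative to the previous stop in the list.
      stops << stops.last + entry[1..-1].to_i
    else
      # No prefix: absolute column.
      stops << entry.to_i
    end
  end
  stops
end

p parse_tabstops('4,6,+2,+8')  # => [4, 6, 8, 16]
p parse_tabstops('20')         # => [20, 40, 60, 80]
```

The sketch trusts the input to be well formed (first entry absolute, all values positive), as the challenge permits.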

Input/output

The tabstop specification is either to be taken as the first command-line parameter to the program, or read from standard in as the first line of input (terminated by a newline), at your discretion. After the tabstop has been read, the remaining input (all input, in the former case) until EOF is to be processed and expanded. The expanded output shall be written to standard out.

All expanded tabstops, and all input, are assumed to be a maximum of 80 columns wide. All expanded tabstops are strictly increasing.


Example

Tabstop specification 4,6,+2,+8 is equivalent to 4,6,8,16, and with either of them the input

ab<Tab>c
<Tab><Tab>d<Tab>e<Tab>f

is expanded into (␣ indicates a space)

ab␣␣c
␣␣␣␣␣␣d␣e␣␣␣␣␣␣␣f

01234567890123456   (Ruler for the above, not part of the output)
          1111111
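The example can be checked with a small ungolfed sketch. The function name expand_line is hypothetical, and the fallback of a single space once the stops run out is my assumption, since the challenge does not say what happens there.

```ruby
# Hypothetical helper: expand the tabs in one line, given a list of
# absolute tabstop columns (leftmost column is 0).
def expand_line(line, stops)
  out = String.new
  line.each_char do |ch|
    if ch == "\t"
      col = out.size
      # The next stop strictly to the right of the current column,
      # which guarantees at least one space per tab.
      target = stops.find { |s| s > col }
      # Assumption: past the last stop, emit a single space.
      target ||= col + 1
      out << ' ' * (target - col)
    else
      out << ch
    end
  end
  out
end

stops = [4, 6, 8, 16]
p expand_line("ab\tc", stops)           # => "ab  c"
p expand_line("\t\td\te\tf", stops)     # => "      d e       f"
```

In the second line the tabs expand to 4, 2, 1 and 7 spaces respectively, landing on columns 4, 6, 8 and 16.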

Scoring is pure code-golf; shortest code wins.

FireFly

Posted 2014-01-20T19:55:18.023

Reputation: 7 107

Answers

2

GolfScript (77 75 chars)

n/(','/{'+'/{~t++}*~:t}%81,{t*}%+:T;{[0\{.9={;T{1$>}?(.)@-' '*}*\)}/;]n+}/;

I'm quite pleased with the tabspec parsing.

# Split on commas
','/
# For each element:
{
    # Split on '+'
    '+'/
    # We now have either ["val"] or ["" "val"]
    # The clever bit: fold
    # Folding a block over a one-element array gives that element, so ["val"] => "val"
    # Folding a block over a two-element array puts both elements on the stack and executes,
    # so ["" "val"]{~t++}* evaluates as
    #     "" "val" ~t++
    # which evaluates val, adds the previous value, and concatenates with that empty string
    {~t++}*
    # Either way we now have a string containing one value. Eval it and assign to t
    ~:t
}%
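For readers who don't speak GolfScript, the same fold trick can be imitated in Ruby. This is only a rough transliteration for illustration, not the answer's code; parse_with_fold is a made-up name, and Ruby's reduce stands in for GolfScript's fold.

```ruby
# Rough Ruby analogue of the fold-based tabspec parse (illustrative only).
def parse_with_fold(spec)
  t = 0 # previous tabstop, like the GolfScript variable t
  spec.split(',').map do |entry|
    # "6" splits to ["6"]; "+2" splits to ["", "2"].
    parts = entry.split('+')
    # reduce without an initial value returns the sole element of a
    # one-element array untouched, and runs the block exactly once
    # for a two-element array.
    joined = parts.reduce { |empty, val| empty + (val.to_i + t).to_s }
    # Either way we now hold a string with one value: convert and remember it.
    t = joined.to_i
  end
end

p parse_with_fold('4,6,+2,+8')  # => [4, 6, 8, 16]
```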

Then I add multiples of the last element until I'm guaranteed to have enough to reach the end of the 80 columns:

81,{t*}%+

This gives the desired behaviour when only one tabstop was specified, and is otherwise only relevant in cases which the spec doesn't mention. (NB it makes the list of tab-stops dip back to 0 and then repeat the last parsed element, but that's irrelevant because when it comes to using the list I look for the first element greater than the current position).
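A small Ruby sketch of why the dip back to 0 is harmless. The names with_multiples and next_stop are hypothetical; the logic just mirrors the description above (append 81 multiples of the last element, then pick the first element greater than the current position).

```ruby
# Append 0*t, 1*t, ..., 80*t, where t is the last parsed tabstop.
def with_multiples(parsed)
  last = parsed.last
  parsed + (0...81).map { |k| k * last }
end

# First element strictly greater than the current column.
def next_stop(stops, col)
  stops.find { |s| s > col }
end

single = with_multiples([4])        # [4, 0, 4, 8, 12, ...]
p single.first(5)                   # => [4, 0, 4, 8, 12]
p next_stop(single, 0)              # => 4
p next_stop(single, 4)              # => 8

list = with_multiples([4, 6, 8, 16])
p next_stop(list, 7)                # => 8
p next_stop(list, 16)               # => 32
```

The list is not sorted, but the lookup never selects the 0 or the repeated element, because neither is greater than a position that has already passed them.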

The rest is pretty straightforward.

Peter Taylor

Posted 2014-01-20T19:55:18.023

Reputation: 41 901

2

Ruby 161 145

Reads the tabstop specification on the first line of input.

i=t=[]
gets.scan(/(\+)?(\d+)/){t<<i=$2.to_i+($1?i:0)}
81.times{|j|t<<j*i}
while gets
$_.sub!$&," "*(t.find{|s|s>i=$`.size}-i)while~/\t/
print
end

edit: Added two lines that make the last read tabstop repeat, so that tabstop specifications of a single number also work correctly

i is a temporary variable holding the last parsed tabstop. t is the list of tabstops, parsed by the gets.scan line. For good measure we add 81 multiples of the last parsed tabstop. The while gets loop keeps going until there is no more input. For each line of input we replace tabs with spaces, one tab at a time, because the column positions shift as we add the spaces and we must recalculate the correct tabstop each time.
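An ungolfed sketch of the one-tab-at-a-time substitution, for anyone who finds the golfed line hard to follow. The function name expand_by_substitution is made up for illustration, and it takes the tabstop list as an argument instead of reading it from standard in.

```ruby
# Replace tabs one at a time; the length of the text before the first
# remaining tab is the current column, and it changes after each substitution.
def expand_by_substitution(line, stops)
  line = line.dup
  while (m = line.match(/\t/))
    col = m.pre_match.size
    target = stops.find { |s| s > col }
    # sub! with a string pattern replaces only the first occurrence.
    line.sub!("\t", ' ' * (target - col))
  end
  line
end

# Stops for 4,6,+2,+8, padded with multiples of the last one as in the answer.
stops = [4, 6, 8, 16] + (0...81).map { |j| j * 16 }
p expand_by_substitution("ab\tc", stops)        # => "ab  c"
p expand_by_substitution("\t\td\te\tf", stops)  # => "      d e       f"
```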

daniero

Posted 2014-01-20T19:55:18.023

Reputation: 17 193

I don’t really know Ruby, but can you write x+($1?i:0) as the shorter $1?x+i:x? – Timwi – 2014-01-25T00:42:28.483

@Timwi Nope! Ruby is a little strange with the ternary operator. Usually you need to put a space in there somewhere, because the colon (:) could also mark the beginning of a symbol, but since a symbol can't start with a digit, :0 is OK without space. Or something. It's weird. The parentheses are crucial also it seems.

– daniero – 2014-01-25T03:44:21.247

That tabstop scanning looks buggy to me. In t<<x+($1?i:0);i=x the first statement doesn't change x, does it? I think you need to reverse it as i=x+($1?i:0);t<<i – Peter Taylor – 2014-01-25T10:12:32.983

In fact you can save 16 by replacing the first two lines with i=t=[] (since i is guaranteed not to be needed the first time around); simplifying the tab-stop parse to {t<<i=$2.to_i+($1?i:0)}, and eliminating l entirely (i already holds that value). But nice one on not caring about the tab stop being strictly increasing: that saves you 4 chars, and I can borrow it to save 2. – Peter Taylor – 2014-01-25T14:06:23.547

@PeterTaylor Thanks for the input! It wasn't directly buggy, but certainly a little bloated. I find it too easy to stare oneself blind on code like this. – daniero – 2014-01-25T18:20:16.567

It was directly buggy: given input of 4,6,+2,+8 it produced tab-stops 4,6,8,10 rather than 4,6,8,16. – Peter Taylor – 2014-01-25T19:47:26.067

@PeterTaylor Wow, I did actually at some point fix exactly that issue; Don't know how it got back in there. Probably during some golfing process. Good catch! – daniero – 2014-01-25T20:30:45.960
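The difference discussed in these comments can be reproduced with an ungolfed sketch. parse_buggy follows the t<<x+($1?i:0);i=x ordering quoted above and parse_fixed follows the suggested i=x+($1?i:0);t<<i; both function names are hypothetical, and starting i at 0 is my simplification.

```ruby
# Buggy ordering: i remembers the raw number just read, not the resolved stop.
def parse_buggy(spec)
  i = 0
  t = []
  spec.scan(/(\+)?(\d+)/) do |plus, digits|
    x = digits.to_i
    t << x + (plus ? i : 0)
    i = x
  end
  t
end

# Fixed ordering: resolve the stop first, then store and remember it.
def parse_fixed(spec)
  i = 0
  t = []
  spec.scan(/(\+)?(\d+)/) do |plus, digits|
    i = digits.to_i + (plus ? i : 0)
    t << i
  end
  t
end

p parse_buggy('4,6,+2,+8')  # => [4, 6, 8, 10]
p parse_fixed('4,6,+2,+8')  # => [4, 6, 8, 16]
```

The two only diverge when one relative entry follows another, which is why a spec with a single + would not have exposed the problem.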

1

C, 228 chars

Here is a C solution to start things off. There's still a lot of golfing to do here (look at all those ifs and fors and putchars...). Tested with the example testcase, as well as with the same input but 4 and 8 for the tab spec.

S[99],i;L,C;main(v){for(v=1;v;)v=scanf("+%d",&C),v=v>0?C+=L:scanf("%d",&C),
v&&(S[L=C]=++i,getchar());for(;i==1&&C<80;)S[C+=L]=1;for(C=L=0;C=~getchar();)
if(C+10)putchar(~C),L+=C+11?1:-L;else for(putchar(32);!S[++L];)putchar(32);}
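As far as I can read the golfed code, it marks every tabstop column in an array and, on a tab, prints one space and then keeps printing spaces until the column counter lands on a marked column. Here is that idea as an ungolfed Ruby sketch (Ruby only for readability). The name expand_with_marks and the bounds guard are my additions, not something the C code has.

```ruby
# Mark-array approach: is_stop[col] is true when col is a tabstop.
def expand_with_marks(text, stops, size = 99)
  is_stop = Array.new(size, false)
  stops.each { |s| is_stop[s] = true if s < size }

  out = String.new
  col = 0
  text.each_char do |ch|
    case ch
    when "\t"
      # Always emit one space, then continue until a marked column is reached.
      loop do
        out << ' '
        col += 1
        # Assumption: stop at the array bound instead of running off the end.
        break if is_stop[col] || col >= size - 1
      end
    when "\n"
      out << ch
      col = 0 # a newline resets the column counter
    else
      out << ch
      col += 1
    end
  end
  out
end

print expand_with_marks("ab\tc\n\t\td\te\tf\n", [4, 6, 8, 16])
```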

FireFly

Posted 2014-01-20T19:55:18.023

Reputation: 7 107