Parse a list of signed unary numbers

16

1

Unary numbers typically only represent nonnegative integers, but we can extend them to represent all integers as follows:

  • A positive integer N is represented as N 1's: 5 -> 11111
  • A negative integer -N is represented as a 0 followed by N 1's: -5 -> 011111
  • Zero is represented as 0

We can then represent a list of these numbers unambiguously if we use 0 as the separator:

3,-2,0,1
111,011,0,1
111 0 011 0 0 0 1
11100110001

Your task: take a string representing such a list of signed unary numbers, and translate it into a list of decimal numbers.

Details

You may assume that the input is a complete list of signed unary numbers. In particular, your program will not have to handle 1) empty input or 2) input that ends with a separator.

You may assume that the magnitude of each number will not exceed 127. For languages with maximum sizes of strings or lists, you may assume that the input and output will fit in your language's data structures, but your algorithm should theoretically work for a list of any size.

Your program or function may perform I/O in any of the standard ways. Input may be a string or a list of characters, single-character strings, integers, or booleans. You may use any two characters to represent 1 and 0; if you don't use 1 and 0, please specify which characters you're using.

Output must be decimal numbers in any reasonable list format (in particular, there must be some kind of a separator between numbers). Negative numbers should be indicated with a minus sign, although if your language has a different format for negative integers I will also accept that. Zero may be represented in the output as 0 or -0.

Test cases

1 -> 1
0 -> 0 (or -0, and similarly for the other test cases)
011 -> -2
1101 -> 2,1
1100 -> 2,0
11001 -> 2,-1
110001 -> 2,0,1
11100110001 -> 3,-2,0,1
00000001 -> 0,0,0,-1
01111011111111001111111111111110111111111111111100111111111111111111111110111111111111111111111111111111111111111111 -> -4,8,-15,16,-23,42
01111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 -> -127

DLosc

Posted 2017-12-16T06:12:10.290

Reputation: 21 213

2Nitpick: Since this contains '0's, it isn't technically unary. Good challenge though! – James – 2017-12-16T07:16:56.343

4@DJMcMayhem Nitpick to the nitpick: I technically never said it was unary. It is an extension of unary which I am calling "signed unary." ;) – DLosc – 2017-12-16T07:23:16.370

@DJMcMayhem IMO, the challenge is specifically that the separator (0) and the negative sign prefix (0) are the same, although it's still unambiguous, since you can't have negative signs in the middle of a number (is 182--693-1 a number? no, and neither is 1111011000101111 for the exact same reason). – Erik the Outgolfer – 2017-12-16T09:35:30.987

Is it okay if the outputted list is in reverse order of the input? – James – 2017-12-16T09:49:43.783

well technically decimal isn't decimal either since it uses the ' - ' symbol – Unlambder – 2017-12-16T10:45:04.917

@DJMcMayhem No, it should be the same list and therefore in the same order. – DLosc – 2017-12-16T18:17:34.363

Doesn't input of 0 end in a separator? – Giuseppe – 2017-12-16T20:26:08.593

@Giuseppe No, for the same reason that input of 011 doesn't begin with a separator: 0 is not considered a separator when it's part of a number. – DLosc – 2017-12-16T21:23:48.320

Answers

10

Python 2, 73 70 bytes

A function that takes a string as input and returns a string representation of a Python list. Zero can be represented both by 0 and -0 (when it comes last):

lambda s:`map(len,s.split('0'))`.replace('0, ','-').replace('--','0,')

Explanation

  1. split the input string s on zeroes.
  2. Take the length of each string in the resulting list (using map).

That takes us a long way. Zeroes were separators after all. And the numbers were unary, so len conveniently converts those to decimal. But now we've messed up all the non-separator uses of 0. Luckily, all non-separator uses were leading zeroes so they came after a separator-zero and gave us zero-length strings ('00'.split('0') == ['', '', '']). Those zero-length strings then also became 0 because of the len.

  1. Turn the list into a string (using "reverse quotes"), so we can fix up the mess more easily.
  2. replace each zero that precedes another number by a negative sign on that number instead. That fixes the use of 0 as a sign but it breaks the literal zeroes. Literal zeroes were also preceded by a separator, so they've now become pairs of extra dashes on the next number.
  3. replace each -- back into a 0 element in the "list".

mercator

Posted 2017-12-16T06:12:10.290

Reputation: 359

1Welcome to PPCG! – Steadybox – 2017-12-17T00:15:34.360

This is a really creative approach! You may want to add a short explanation so that those who don't speak Python can appreciate your answer too. – DLosc – 2017-12-17T00:16:14.820

@DLosc, thanks, I didn't know about the backtick. Wordy explanation also added. – mercator – 2017-12-18T22:00:41.393

8

Retina, 23 21 bytes

(.)0
$1 
01
-1
1+
$.&

Try it online!

The first stage (.)0<newline>$1<space> matches any character followed by a 0. The match is replaced by the first character followed by a space. This splits the string in the individual numbers.

The second stage 01<newline>-1 replaces 0's before a block of 1's to to the - sign.

The last stage 1+<newline>$.& matches all blocks of 1's and replaces them with the length of the group.

Here is an example with the output of the individual stages.

ovs

Posted 2017-12-16T06:12:10.290

Reputation: 21 408

Very nice - all of my ideas seem to clock in at 24 bytes... – Neil – 2017-12-16T11:04:10.250

1Could you please add an explanation? I don't speak Retina. – Daniel – 2017-12-17T15:04:35.317

@Dopapp added an explanation – ovs – 2017-12-17T17:50:40.790

7

Vim, 56 bytes

:s/\v(0?1*)0?/\1\r/g|%s/0/-/|%s/1*$/\=len(submatch(0))
D

Try it online!

I haven't posted in vim in a while. I'm mostly using vim because V is a pain sometimes. Because the count command, which is perfect for getting the number of '1's on the line will overwrite any '0's on the line, so we can't negate it afterwards.

Explanation:

This is one byte shorter then the straightforward way:

:s/\v(0?1*)0?/\1\r/g
:%s/0/-
:%s/1*$/\=len(submatch(0))
D

due to command chaining. Since that one separates the commands, I'll use it for the explanation.

:s/                     " Substitute
                        " Search for...
   \v                   "   Enable 'magic'. This determines whether certain atoms require a backslash or not.
                        "   Without it we would have: '\(0\?1*\)0\?', which is 2 bytes longer
      0?                "   An optional 0
        1*              "   Followed by any number of '1's
     (    )             "   (call that group 1)
           0?           "   Followed by another optional 0
             /          " Replace it with...
              \1        "   Subgroup 1
                \r      "   A newline
                  /g    " Do this for every match on the current line.

Now, each signed unary number is on an individual line. Using '11100110001' as an example, at this point we'll have:

111
011
0
1

:%s/0   " Replace every 0
     /- " With a dash  

:%s/1*$/                    " Replace every run of 1's at the end of a line
        \=len(submatch(0))  " With the length of said run

Since we added newlines at the end of each match, we had an empty line before running that. After running that, we'll have a '0' (because it matched a run of 0 '1's). So we just call D to delete this line, leaving it blank

James

Posted 2017-12-16T06:12:10.290

Reputation: 54 537

Ugh. :%s/1+$/ would get you one byte shorter if not for the need to backslash the + :( – NieDzejkob – 2017-12-16T08:02:24.853

@NieDzejkob I don't understand why that would be shorter. And also, that would give - instead of 0 or -0 – James – 2017-12-16T08:04:51.853

I wanted to eliminate the last line that way :P, never mind. – NieDzejkob – 2017-12-16T08:07:34.267

7

Haskell, 68 66 bytes

f(x:r)|(a,b)<-span(>0)r=([(0-),(1+)]!!x$sum a):[z|_:t<-[b],z<-f t]

Try it online! Takes input as a list of zeros and ones. Example usage: f [0,0,0,1,1] yields [0,-2].

Explanation:

The pattern matching in f(x:r)|(a,b)<-span(>0)r binds x to the first element of the input, a to a (potentially empty) list of following 1s, and b to the rest of the input. Given an input [0,1,1,1,0,0,1], we get x=0, a=[1,1,1] and b=[0,0,1].

The current number is then either the sum of a negated if x=0, or the sum of a plus one if x=1. This is achieved by indexing with x into a list containing a negation and increment function, and applying the resulting function to the sum of a: [(0-),(1+)]!!x$sum a.

The rest list b is either empty or contains a separating zero and the next number. The list comprehension [z|_:t<-[b],z<-f t] tries to match b on the pattern _:t, that is forget the head element and bind the rest of the list to t. If b is empty this match fails and the list comprehension evaluates to [], which is the base case for the recursion. Otherwise the function f is recursively applied to t and the list comprehension evaluates to all elements z from the result of f t.

Laikoni

Posted 2017-12-16T06:12:10.290

Reputation: 23 676

3

Wolfram Language (Mathematica), 80 bytes

StringCases[#<>"0",x_~~Shortest@y___~~"0":>(If[x=="0",-#,#+1]&)@StringLength@y]&

Try it online!

Abuses the mechanic of StringCases, since it does not check overlapping patterns. Since we search from left to right, without overlaps, we always get only the integers we need.

Explanation

#<>"0"

Append a zero at the end

StringCases

Find all of the following pattern...

x_~~Shortest@y___~~"0"

A single character (call it x), followed by the shortest possible zero-length or longer string (call it y), followed by a zero.

(If[x=="0",-#,#+1]&)@StringLength@y

Apply to matching pattern: take the length of y. If x is zero, then negate the value. Else, increment one.

This covers 00 as well, since y would be an empty string, and we would compute -0 (== 0).

JungHwan Min

Posted 2017-12-16T06:12:10.290

Reputation: 13 290

3

Brain-Flak, 94 (70?) bytes

([]){{}({(<()>)()}{}(<>)<>{<([{}]<>{})><>})
{{}<>([{}])(<>)}{}{}([])}{}<>([]){{}({}<>)<>([])}<>

Try it online!

This is actually surprisingly terse for brain-flak.

Here is a commented/readable version:

([])

{

    #Pop the Stack height
    {}

    (
        #If there isn't a leading 0, evaluate to 1...
        {
            (<()>)

            ()
        }

        #Pop the 0
        {}

        #Push a 0 onto the alternate stack
        (<>)
        <>

        #Run of '1's
        {
            #Decrement the alternate stack
            <([{}]<>{})>
            <>
        }

        #And push it here
    )

    #Was there a not leading 0?

    {
        {}

        #Invert the value on the alternate stack
        <>([{}])(<>)
    }

    #Pop 2 zeros
    {}{}


    ([])

}{}<>

#Push stack height
([])

#Reverse the stack
{

    {}

    ({}<>)

    <>([])

}<>

If the output can be in reverse, we can do this for 70 instead:

([]){{}({(<()>)()}{}(<>)<>{<([{}]<>{})><>}){{}<>([{}])(<>)}{}{}([])}<>

This tip of mine is almost perfect for this situation. But it doesn't quite work since we have to push a 0 before doing the operation (counting the '1's), and the operation happens in a loop. The shortest I could come up with utilizing this tip is:

([]){{}({(<()>)()}{}(<>)<>{<([{}]<>{})><>})
{{}<>([{}])(<>)}{}{}(<>())<>([])}{}<>{{}({}<>)<>}<>

which is also 94 bytes.

James

Posted 2017-12-16T06:12:10.290

Reputation: 54 537

3

Acc!!, 252 237 bytes

N
Count i while _/48 {
Count n while 48/_ {
Write 45
50+N
}
_+49/_*50
Count u while _%50/49 {
_+100-_%50+N
}
_/50-1
Count h while _/200 {
Write _/200+48
_%200+1
}
Count t while _/20+_%2 {
Write _/20+48
_%20-_%2
}
Write _/2+48
Write 9
N
}

Uses -0. Outputs numbers separated by tab characters, with a trailing tab. Try it online!

Amount of time writing the actual algorithm: 20 minutes. Amount of time debugging my decimal output code: 45 minutes. :^P

With comments

I don't know if these comments explain the code very well--they're based on my notes to myself while I was writing it, so they assume some understanding of how Acc!! works. If anything needs more explanation, let me know and I'll try to make it clearer.

# We partition the accumulator _ as [number][flag][nextchar]
# [flag] is a 2-value slot and [nextchar] a 50-value slot
# So [nextchar] is _%50, [flag] is _/50%2, [number] is _/100
# [flag] is 1 if we're in the middle of reading a number, 0 if we're between numbers
# It is also used for outputting as decimal (see below)
# Possible input characters are 0, 1, and newline, so [nextchar] is 48, 49, or 10

# Read the first character
N
# Loop while the character we just read is 0 or 1 and not newline
Count i while _/48 {
  # What we do in the loop depends on the combination of [flag] and [nextchar]:
  # 0,48 (start of number, read 0) => write minus sign, [flag] = 1, read another char
  # _,49 (read 1) => increment [number], [flag] = 1, read another char
  # 1,48 (middle of number, read 0) => write/clear [number], status = 0, read another
  #      char
  # 1,10 (middle of number, read <cr>) => ditto; the next read will be 0 for eof, which
  #      means the acc will be less than 48 and exit the loop

  # Process leading 0, if any
  Count n while 48/_ {
    # acc is 48: i.e. [number] is 0, [flag] is 0, [nextchar] is 48 (representing a 0)
    # Output minus sign
    Write 45
    # Set [flag] to 1 (thereby exiting loop) and read [nextchar]
    50+N
  }
  # If number starts with 1, then we didn't do the previous loop and [flag] is not set
  # In this case, acc is 49, so we add (50 if acc <= 49) to set [flag]
  _+49/_*50

  # Process a run of 1's
  Count u while _%50/49 {
    # [nextchar] is 49 (representing a 1)
    # Increment [number] and read another
    _+100-_%50+N
  }

  # At this stage, we know that we're at the end of a number, so write it as decimal
  # This is "easier" (ha) because the number has at most three digits
  # We shift our partitioning to [number][flag] and set [flag] to 0
  _/50-1

  # Output hundreds digit if nonzero
  # Since [number] is _/2, the hundreds digit is _/200
  Count h while _/200 {
    Write _/200+48
    # Mod 200 leaves only tens and units; also, set [flag] to 1
    _%200+1
  }
  # Output tens digit (_/20) if nonzero OR if there was a hundreds digit
  # In the latter case, [flag] is 1
  Count t while _/20+_%2 {
    Write _/20+48
    # Mod 20 leaves only units; clear [flag] if it was set
    _%20-_%2
  }
  # Write units unconditionally
  Write _/2+48

  # Write a tab for the separator
  Write 9
  # Read another character
  N
}

DLosc

Posted 2017-12-16T06:12:10.290

Reputation: 21 213

3

Perl 5, 40 + 1 (-n) = 41 bytes

print'-'x/00/.y/1//.$"for"0$_"=~/00?1*/g

Try it online!

Xcali

Posted 2017-12-16T06:12:10.290

Reputation: 7 671

3

Husk, 20 18 17 15 14 bytes

Γ~:?Σṁ_Πȯ₀tΣġ/

Try it online!

Explanation

Γ~:?Σṁ_Πȯ₀tΣġ/  Input is a list, say x = [0,1,1,0,0,0,1,1]
            ġ   Group by
             /  division.
                This splits x right before each 0: [[0,1,1],[0],[0],[0,1,1]]
Γ               Deconstruct into head y = [0,1,1] and tail z = [[0],[0],[0,1,1]]
   ?Σṁ_Π        Apply to y:
       Π         Product: 0
   ?Σ            If that is nonzero, take sum of y,
     ṁ_          else take sum of negated elements of y: u = -2
        ȯ₀tΣ    Apply to z:
           Σ     Concatenate: [0,0,0,1,1]
          t      Drop first element: [0,0,1,1]
         ₀       Recurse: [0,2]
 ~:             Tack u to the front: [-2,0,2]

The splitting works like this. ġ/ splits its argument between each pair of elements a,b for which /a b is falsy. /a b is division with flipped arguments, so b divided by a. The relevant values in this program are these:

  • /1 1 gives 1 (truthy).
  • /1 0 gives 0 (falsy).
  • /0 1 gives Inf (positive infinity, truthy).
  • /0 0 gives Any (a special NaN-like value, falsy).

Zgarb

Posted 2017-12-16T06:12:10.290

Reputation: 39 083

2

Python 2, 96 92 bytes

lambda s:[[len(t),-len(t)+1]['1'>t]for t in s.replace('10','1 ').replace('00','0 ').split()]

Try it online!

Thx to ovs and DLosc for 2 bytes each.

Chas Brown

Posted 2017-12-16T06:12:10.290

Reputation: 8 959

2

R, 119 bytes

function(x){n=nchar
y=regmatches(x,regexec("(0?)(1*)0?([01]*)",x))[[1]]
cat((-1)^n(y[2])*n(y[3]),"")
if(y[4]>0)f(y[4])}

Try it online!

The code uses this solution from stackoverflow for a related problem (Thanks to jeales for the idea). The output is a space-separated string printed to stdout.

NofP

Posted 2017-12-16T06:12:10.290

Reputation: 754

2

Jelly,  19  18 bytes

There must be a better way...

®ḢN$Ḣ©?ṄEȧ
ṣ0L€ÇL¿

A full program printing each number followed by a linefeed.

Try it online!

How?

®ḢN$Ḣ©?ṄEȧ - Link 1, print first number and yield next input: list of numbers, X
           -                              e.g. [8,0,15,16,...] or [0,4,8,0,15,16,...]
      ?    - if...
    Ḣ      - condition: yield head and modify  8([0,15,16,...])   0([4,8,0,15,16,...])  
     ©     -            (copy to register)     8                  0
®          - then: recall from the register    8
   $       - else: last two links as a monad:
 Ḣ         -         yield head and modify                        4([8,0,15,16,...])
  N                  negate                                      -4
       Ṅ   - print that and yield it           8                 -4
        E  - all equal (to get 0 to be truthy) 1                  1
         ȧ - AND the (modified) input          [0,15,16,...]      [8,0,15,16,...]
           -   (ready to be the input for the next call to this link)

ṣ0L€ÇL¿ - Main link: list e.g. [0,1,0,0,0,0,1,1]
ṣ0      - split at zeros       [[],[1],[],[],[],[1,1]
  L€    - length of €ach       [0,1,0,0,0,2]
      ¿ - while...
     L  - condition: length                           1  1  1  0  ([0,1,0,0,0,2], [0,0,0,2], [0,2], [])
    Ç   - action: call the last link (1) as a monad  -1  0 -2     ( - 1            - 0        - 2)

Jonathan Allan

Posted 2017-12-16T06:12:10.290

Reputation: 67 804

1

QBasic, 88 86 bytes

1u$=INPUT$(1)
z=u$<"1
IF n*z THEN?(1-2*s)*(n-s):s=0:n=0ELSE s=s-z:n=n+1
IF"!"<u$GOTO 1

This was fun. Multiple revisions starting from a 107-byte version resulted in one of the most obfuscated bits of QBasic I think I've ever written. (Edit: Oddly enough, I was able to golf 2 bytes by making the code clearer.)

Note: this program reads user input one character at a time without echoing it to the screen (a result of using INPUT$(1) instead of the usual INPUT statement). So as you're typing, you won't see the 1's and 0's, but the decimal numbers will appear as they are calculated. Make sure to hit Enter at the end of the input to see the last number and end the program.

Ungolfed version

sign = 0
num = 0
DO
  digit$ = INPUT$(1)
  isZero = (digit$ < "1")
  IF num > 0 AND isZero THEN
    PRINT (1 - 2 * sign) * (num - sign)
    sign = 0
    num = 0
  ELSE
    IF isZero THEN sign = 1
    num = num + 1
  END IF
LOOP WHILE "!" < digit$

Explanation

(AKA "What?? That still makes no sense!")

The base strategy is to run a loop that grabs one character from INPUT$(1) each time through, does stuff with it, and keeps looping as long as the character has an ASCII value greater than that of ! (i.e., wasn't a newline).

We keep track of numbers-in-progress using two variables. num is the number of characters in the current signed unary number (including any leading zero). sign is 1 if the number had a leading zero, 0 if not. Both of these need to be initialized to 0, which is great for the golfed version because numeric variables in QBasic are automatically initialized to 0.

Whenever we read a character, the first thing is to determine whether it's 1 or 0. We'll use this result twice, so we store it in isZero. Technically, that name is misleading, since the value will also be truthy if the character is a newline. Note that truthy in QBasic is -1 and falsey is 0.

Now, if we're in the middle of reading a number (num > 0) and we hit a zero or end of input (isZero), we need to calculate which number we've finished reading.

  • sign stores 0 for positive, 1 for negative. To get 1 for positive and -1 for negative, we need 1-2*sign.
  • num stores the correct magnitude for positives but one more than the magnitude for negatives (since it includes the sign marker). So we can use num-sign for the magnitude.

Multiply these together and print; then reset sign and num to 0 in preparation for reading the next number.

Otherwise (if we haven't hit a zero, or if we've hit a zero at the beginning of a number), we update sign and num as follows:

  • sign becomes 1 if we're looking at a leading zero; otherwise, if we're looking at a one, it stays at whatever it already was. The golfed code is s=s-z, which amounts to the same thing:
    • If this is a leading zero, z is -1. Since s is guaranteed to be 0 (because this is the start of a new number), s-z will be 1.
    • If this is a one, z is 0. Then s-z stays at whatever value s had previously.
  • num is incremented.

That's it!

DLosc

Posted 2017-12-16T06:12:10.290

Reputation: 21 213

0

JavaScript (ES6), 60 bytes

Returns a space-separated list of integers.

s=>(0+s).replace(/00?1*/g,s=>(l=s.length,+s[1]?l-1:2-l)+' ')

Test cases

let f =

s=>(0+s).replace(/00?1*/g,s=>(l=s.length,+s[1]?l-1:2-l)+' ')

console.log(f("1")) // 1
console.log(f("0")) // 0
console.log(f("011")) // -2
console.log(f("1101")) // 2,1
console.log(f("1100")) // 2,0
console.log(f("11001")) // 2,-1
console.log(f("110001")) // 2,0,1
console.log(f("11100110001")) // 3,-2,0,1
console.log(f("00000001")) // 0,0,0,-1
console.log(f("01111011111111001111111111111110111111111111111100111111111111111111111110111111111111111111111111111111111111111111")) // -4,8,-15,16,-23,42
console.log(f("01111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111")) // -127

Arnauld

Posted 2017-12-16T06:12:10.290

Reputation: 111 334

0

Lua, 58 bytes

(...):gsub("(0?)(1*)0?",function(s,n)print(#n-2*#s*#n)end)

Try it online!

Full program, takes input from the command line and prints the numbers to stdout separated by newlines.

Jonathan S.

Posted 2017-12-16T06:12:10.290

Reputation: 423